Nanooptics, which describes the interaction of light with matter at the nanoscale, is a topic
of quantum mechanical phenomena in action. This self-contained and extensively refer-
chemistry, electrical engineering, and materials science. Presenting an extensive theoretical
knowledge required to master the material is carefully introduced, with detailed derivations
and frequent worked examples allowing readers to gain a thorough understanding
approximations and simplifying assumptions often used to make such problems tractable
while representative of the observed features.
Preface


I regarded as quite useless the reading of large treatises of pure analysis: too large a 
number of methods pass at once before the eyes. It is in the works of application that 
one must study them; one judges their utility there and appraises the manner of making 
use of them. 

Joseph Louis Lagrange 


The use of the prefix “nano,” standing for a billionth ($10^{-9}$) of a base unit, has become
quite prevalent in recent years. This is mainly due to the enormous success of the micro- 
electronics industry in reducing the size of transistors inside a computer chip (the smallest 
feature size has reached below 5 nm), leading to the advent of smartphones in 2007. The 
words nanoscience and nanotechnology have also become common over the last 20 years. 
It is commonly understood that nanoscience refers to the science of nanoscale devices, 
while nanotechnology makes use of such devices for practical applications. Nanoscale
devices behave quite differently from the objects of our day-to-day experience. In this book, we
build a tool kit useful for designing, controlling, and understanding the operation of such 
devices. 

Understanding of nature is based on simplified models with varying degrees of com- 
plexity that can be easily manipulated and observed. Our motivation behind this book is 
to show that such models of nanoscale quantum devices can be used to understand their 
operation and to optimize their performance. We achieve this goal by adopting quantum 
mechanics for the description of nanoscale devices. The quantum effects often dominate 
in such devices and must be included for their realistic description. Quantum mechanics is 
extraordinarily successful, and its predictions have been confirmed to an astonishing level 
of precision in a broad spectrum of experiments. Even though it has a somewhat mysterious axiomatic
foundation, it provides an accurate mapping between the real-world devices and theoretical 
models used for them. As one delves deeper, it becomes clear that mathematics plays an 
integral bridging role. Thus, we do not shy away from introducing sophisticated mathe- 
matical tools, but provide examples and sufficient details for understanding the material in 
a manner that we have not found in other texts. It is our firm belief that the complexity of 
underlying theoretical tools should not be a barrier for graduate students and scientists to 
model nanoscale devices accurately. 

This book is an attempt to bring widely scattered, diverse theoretical tools into a single
volume that should prove useful for both graduate students and scientists or engineers 
interested in using nanoscale quantum devices for various applications. Many books cover 
distinct areas of nanoscience and nanotechnology separately. This book is intended not to 
replace them but to complement them. The book content is the tool kit that the authors use in
their day-to-day research work as active researchers in the field of nanophotonics. We hope
that our exposition of quantum systems will illuminate and inspire the reader to become
proficient in designing and analyzing novel nanoscale devices with diverse applications. 

The book is organized as follows. Its first chapter provides a historical introduction to the
field of nanoscience and introduces features such as quantum confinement, quantum inter- 
ference, and quantum transport that become relevant for nanoscale devices. We review 
in Chapter 2 the quantum-mechanical concepts and mathematical tools needed for later 
chapters. Linear response theory is discussed in Chapter 3, where we also calculate the 
generalized susceptibility and introduce the fluctuation-dissipation theorem. The effects of 
the environment that lead to dissipation and decoherence in a quantum device are covered 
in Chapter 4, where we use a master-equation approach to derive the Lindblad and Red- 
field equations. The focus of Chapter 5 is on the flow of current inside quantum devices. 
We first use a simple approach based on scattering matrices and later apply the nonequi- 
librium Green’s function method to calculate the current. The phenomenon of quantum 
tunneling is covered in Chapter 6, where we calculate the tunneling current using two dif- 
ferent methods. Chapter 7 is devoted to quantum noise. After discussing important noise 
sources leading to thermal noise, shot noise, and Brownian motion, we introduce the quan- 
tum Langevin equations and apply them to lasers to calculate the noise spectra associated 
with the intensity and phase fluctuations. The concept of a squeezed state is also covered 
in this chapter. 

This monograph should serve well the needs of the nanoscience community interested 
in modeling and understanding the operation of nanoscale quantum devices. The potential 
readership is likely to consist of graduate students enrolled in MS and PhD programs and 
scientists working in fields such as nanophotonics and plasmonics. The book may also 
be useful for a high-level graduate course devoted to nanoscale devices. The extensive 
bibliography at the end of this book should also be helpful to readers. 

It is a pleasure to acknowledge the help we have received from various sources in writ- 
ing this book. Many individuals have contributed to the completion of this book, either 
directly or indirectly. We are thankful to all of them, especially to our graduate students 
and postdoctoral fellows, whose curiosity helped us in understanding better the material 
presented in this book. Particular mention is due to Nicholas Gibbons of Cambridge Uni- 
versity Press, who helped us steer this project to completion. No authors could wish for a 
more supportive editor, and we thank him, Sarah Lambert, Henry Cockburn, and the rest of the
CUP team. Last but not least, we thank Erosha and Anne for their love, encouragement, 
and support. 
The principles of physics, as far as I can see, do not speak against the possibility of
maneuvering things atom by atom. It is not an attempt to violate any laws; it is some- 
thing, in principle, that can be done; but in practice, it has not been done because we 
are too big. 

Richard Feynman 


1.1 Nanoscience and Nanotechnology 


The widely used International System of Units (the SI, short for Système International) 
was adopted in 1889. It is based on seven base units — second (s), meter (m), kilogram 
(kg), ampere (A), kelvin (K), mole (mol), and candela (cd) — for measuring time, length, 
mass, electric current, temperature, amount of substance, and luminous intensity, respec- 
tively. Prefixes representing integer powers of 10 are added to these base units to produce 
multiples and submultiples of the original unit. The SI system also specifies that Latin 
terms should be used for negative powers of 10, such as milli (m), micro (μ), and nano (n), and
Greek terms should be used for positive powers of 10, such as kilo (k), mega (M), and giga (G).
The prefix nano was adopted in 1958 to mean precisely $10^{-9}$ SI units. According to the
Oxford English Dictionary, the word nano originates from the classical Latin nanus, or its
ancient Greek etymon nanos (νᾶνος), meaning dwarf [1].

In 1974, Norio Taniguchi introduced the term nanotechnology to describe his work on 
ultrafine machining and its potential for engineering devices at a submicrometer scale [2]. 
The modern usage of this term extends well beyond this simple machine metaphor and 
corresponds to a transformational technology capable of assembling, manipulating, and 
controlling individual atoms, molecules, or their interactions on a nanometer scale (1 to 
100 nm). Even though this usage captures the essence of present-day nanotechnology, it is 
based on the size of objects involved and thus has many deficiencies. The International
Organization for Standardization (ISO), for example, has proposed to broaden the scope by
including materials having at least one internal or surface feature, where the onset of size- 
dependent phenomena differs from the properties of individual atoms and molecules. Such 
structures enable novel applications and lead to improved materials, devices, and systems 
by exploiting nanoscale properties. 

Nanoscience can be described as the science of nanoscale devices. Essentially, one can 
consider nanoscience as the bridge between classical and quantum physics — it is a scale 
where we can make use of both aspects to harness the collective rather than the individual
properties of atoms and molecules. As we shall see later, these collective properties
of individual building blocks predominantly define the novel aspects of nanostructures. 
Figure 1.1 shows a variety of objects covering length scales ranging from 0.1 nm to 1 cm. 
An expanded view of a few nanoscale (1 to 100 nm) objects involved in the development 
of nanotechnology is shown on the right side of this figure. 


1.1.1 Historical Perspective 
Historically, James Clerk Maxwell proposed in 1867 the use of tiny machines to violate
the second law of thermodynamics, which states that the entropy of a closed system cannot
decrease. According to this law, heat must flow from hot to cold, and a perpetual motion
machine cannot be built. The gedanken experiment he proposed is known as Maxwell’s
demon and involves a machine (or demon) guarding a tiny hole between two gas reservoirs 
at the same temperature. The demon can measure the speed of individual molecules and 
let through only the fast ones, which would create a temperature difference between the 
two reservoirs without doing any work. As the second law of thermodynamics has stood 
the test of time, it is unlikely that Maxwell’s demon would succeed, but it is fascinat- 
ing to see that molecular-level sensing and manipulation ideas were conceived more than 
150 years ago. 

More recently, in a 1959 lecture titled “There’s plenty of room at the bottom” 
and given to the American Physical Society, the physicist Richard Feynman alluded 
to the possibility of having miniaturized devices, made of a small number of atoms 
and working in compact spaces, for exploiting specific effects unique to their size 
and shape to control synthetic chemical reactions and to produce useful devices or 
substances. 

Historical evidence exists showing that humans have exploited the interaction of light 
with nanoparticles, without understanding the physics behind it. An intriguing example 
is provided by the Lycurgus Cup shown in Figure 1.2. It is thought to have been made by 
Roman craftsmen during the fourth century. The cup contains gold and silver nanoparticles 
embedded in the glass and exhibits a color-changing property that makes its glass take 
on different hues, depending on the light source. It appears jade-green when observed in 
reflected light. However, when light is shone into the cup, it appears translucent-red from 
the outside. The second object in Figure 1.2, a stained glass window at Lancaster Cathedral 
showing Edmund and Thomas of Canterbury, uses trapped gold and silver nanoparticles in 
the glass to generate the ruby-red and deep-yellow colors, respectively. These visual effects 
can be explained using modern theories on plasmon generation, but it is still a puzzle how
ancient craftsmen knew the precise material properties and compositions to realize them
in practice.

Figure 1.2 Historical evidence of the use of plasmonic effects. The Lycurgus Cup (left), thought to have been made during the
fourth century, exhibits a color-changing property. It appears jade-green when looked at in reflected light but
translucent-red when light is shone into the cup because it is made of glass containing gold and silver nanoparticles.
The stained glass window (right) at Lancaster Cathedral, in which gold and silver nanoparticles were used to generate
the ruby-red and deep-yellow colors. (The color version of this figure is available online.)

Regardless of the current advances that allow humans to harness the power of nanotech- 
nology, the striking reality is that natural processes have cleverly utilized nanotechnology 
effects for billions of years. Examples include harvesting of solar energy through photosynthesis,
accurate replication of the DNA structure, and repair of any damage to DNA
incurred because of endogenous or exogenous factors. Discovery of such effects unique to
the nanoscale is the prime task of nanoscience. Theoretical know-how and understanding
developed through nanoscience are used for nanotechnology that benefits society through
its specific applications such as longer lasting tennis balls, more efficient solar cells,
and cleaner diesel engines. However, from prehistoric times to now, there are numerous 
cases where the use of a technology preceded the underlying science; practitioners were 
unaware of the reasons for peculiar behavior they found in materials and devices much 
different from familiar individual atoms, molecules, and bulk matter, yet proceeded to use 
them in applications — a model that modern engineers and scientists appear to emulate 
even now! 


1.1.2 New Features Appearing at the Nanoscale 


Materials interacting with electromagnetic and other fields exhibit phenomena on a broad 
range of spatial and temporal scales. A basic postulate in physics is the independence 
of any observation with respect to the choice of time, place, and units. It requires that 
physical quantities rescale by the same amount throughout space-time, yet it does not 
imply that physics is scale invariant. It is very clear that, at the smallest scale, physics
demands a quantized treatment, and Planck’s constant h sets the smallest observable limit. 
The standard model of elementary particles identifies four fundamental forces that gov- 
ern our universe: gravity, the weak force, the strong force, and the electromagnetic force. 
Each of these forces has a specific coupling strength and a specific distance dependence. 
The gravitational and electromagnetic forces scale as $1/r^2$ (called the inverse-square law)
and they can act over long distances, but the weak and strong forces act only over short
distances. At distances greater than $10^{-14}$ m, the strong force is practically unobservable,
and the weak force has no influence over distances greater than $10^{-18}$ m. All this suggests
that we need to pay attention to the scale and units used for measuring different
quantities. 

A common feature of all forces is that they fade away as one moves from the source. 
Quantum field theory explains any force between two objects using exchange particles, 
which are virtual particles emitted from one object (source) and absorbed at the other 
(sink). Four types of exchange particles — photons, gluons, weak bosons, and gravitons — 
give rise to four forces; they all have a spin of 1 in units of $\hbar = h/(2\pi)$ and transfer
momentum between the two interacting objects. The rate at which momentum is exchanged 
is equal to the force created between the two objects. Quantum field theory shows that 
this force weakens as the distance between objects increases. For example, the electromagnetic
force between two charged particles decreases as $1/r^2$, but this dependence becomes
$1/r^4$ for dipole-dipole interactions. As the origin of most physical or chemical properties
can be traced back to the interactions among the atomic or molecular constituents, all
such properties tend to carry remnants of the inverse-distance dependence and manifest 
as size-dependent features for the nanoscale objects. For example, the following material 
properties become size dependent (to various degrees) when at least one of the dimensions 
goes below 100 nm: 


• Mechanical properties [3]: elastic moduli, adhesion, friction, capillary forces;
• Thermal properties [4, 5]: melting point, thermal conductivity;
• Chemical properties [6, 7]: reactivity, catalysis;
• Electrical properties [8]: quantized conductance, Coulomb blockade;
• Magnetic properties [9]: spin-dependent transport, giant magnetoresistance;
• Optical properties [10]: band structure, band-gap energy, nonlinear response.


A practical and useful aspect of this size dependence is that it gives engineers the 
ability to tune one or more properties of bulk materials by resizing them to the nano- 
regime (1 nm to 100 nm). It is this feature that lies behind the concept of metamaterials 
— artificially designed materials that allow one to use nanotechnology for practical 
applications. 


1.1.3 Surface-to-Volume Ratio 


In nanosize objects, surface atoms behave somewhat differently from bulk atoms. A
simple way to judge whether the surface or bulk effects dominate is to consider the ratio 
of surface area A and volume V of a nanostructure. Table 1.1 compares the ratio A/V for 
three solids in the shape of a sphere, a cube, and a right-square pyramid. It shows that this 
ratio scales as 1/r, where r is a measure of linear size. This scaling is found to hold for all 
regular, simple structures. Even for a complicated structure, if we can identify a single size 
parameter (e.g., by enclosing the structure inside a sphere of radius r), the same scaling 
holds approximately. 

Physically, the 1/r scaling implies that, when the size of a three-dimensional structure 
shrinks, the ratio A/V increases. The drastic effect on surface area can be seen in the 
example shown in Figure 1.3. The cube A has 1 m sides with a surface area of 6 m$^2$. If it
is divided into smaller 1 cm-size cubes (part B), each cube has an area of 6 cm$^2$ but there
are $10^6$ such cubes, resulting in a total surface area of 600 m$^2$. If 1 nm-size cubes are made
of cube A (part C), the total surface area would become 6000 km$^2$. Even though the total
volume remains the same in all three cases, the collective surface area is greatly increased 
with a reduced size of each cube. 
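
The numbers quoted for Figure 1.3 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `total_surface_area` is a hypothetical helper introduced here, and the parent cube is assumed to be diced into identical smaller cubes with no material lost.

```python
def total_surface_area(parent_side_m: float, child_side_m: float) -> float:
    """Total surface area (m^2) of all small cubes cut from one parent cube.

    Assumes the parent cube is diced into identical cubes of side
    child_side_m, so the total volume is unchanged.
    """
    cubes_per_edge = parent_side_m / child_side_m
    number_of_cubes = cubes_per_edge ** 3
    area_per_cube = 6.0 * child_side_m ** 2
    return number_of_cubes * area_per_cube


if __name__ == "__main__":
    parent_side = 1.0  # cube A of Figure 1.3, side in metres
    for label, side in [("1 m", 1.0), ("1 cm", 1e-2), ("1 nm", 1e-9)]:
        area = total_surface_area(parent_side, side)
        # The volume is 1 m^3 in every case, so A/V has the same numerical value as A.
        print(f"{label:>5} cubes: total area = {area:.3g} m^2")
```

Because the total area equals 6V/s for cubes of side s, the 1/r scaling of A/V discussed above appears directly: each factor of 10 reduction in the cube size raises the collective surface area by a factor of 10.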

A drastic increase in the area of surfaces (or interfaces) can lead to entirely new elec- 
tronic and vibrational states associated with each surface. Indeed, surface effects are 
responsible for melting initiated at a surface (premelting) and a lower melting tempera- 
ture of a compact object (compared to its bulk counterpart) [11, 12, 13]. Also, considerable 
variations in the thermal conductivity of nanostructures can be partially attributed to an 
enhanced surface area [14]. For example, thermal conductivity of a nanowire can be much
lower than that of the corresponding bulk material [15]; similarly, carbon nanotubes exhibit
much higher thermal conductivity than diamond [16].

Table 1.1 Surface-to-volume ratio of three regular solids.

Solid                                           A             V                       A/V
Cube of side r                                  $6r^2$        $r^3$                   $6/r$
Sphere of radius r                              $4\pi r^2$    $\frac{4}{3}\pi r^3$    $3/r$
Pyramid with side r and height $\sqrt{3}r/2$    $3r^2$        $r^3/(2\sqrt{3})$       $6\sqrt{3}/r$

Figure 1.3 Dramatic increase in total surface area of nanostructures for a given volume. The cube in part A has 1-m sides, smaller
cubes in part B have 1-cm sides, and the smallest cubes in part C have 1-nm sides. The total surface area of all cubes
first increases from 6 m$^2$ to 600 m$^2$ and then to 6000 km$^2$ for the same volume.

As another example of the surface effects, even though bulk gold is relatively inert
chemically, it shows high chemical reactivity in the form of a nanosize cluster. This can 
be partially attributed to the abundance of surface atoms in a gold nanocluster that behave 
like individual atoms [17]. Similarly, bulk silver tends not to react with hydrochloric acid. 
However, high reactivity of silver nanoparticles with hydrochloric acid has been observed 
and is attributed to the electronic structure of the surface states [18]. 

In addition to the thermal and chemical properties, mechanical and electrical proper- 
ties are also affected by an increased surface area. For example, indium arsenide (InAs) 
nanowires exhibit a monotonic decrease in mobility as their radius is reduced to below 
10 nm. The low-temperature transport data show clearly that it is surface-roughness scat- 
tering that leads to mobility degradation [19]. Moreover, a reduced coordination of surface 
atoms and the presence of surface charges can exert a relatively high stress that is well 
beyond the elastic regime [20, 21]. Peculiarly enough, charges present on the polar surfaces
of thin zinc-oxide (ZnO) nanobelts can cause spontaneous formation of rings and
coils [22]. A large surface area and the resulting surface effects can also explain the
occurrence of unexpected phenomena at the nanoscale such as shape-based memory and 
pseudoelasticity. For example, Young’s modulus of films that are thinner than 10 atomic 
layers is found to be 30% smaller than the bulk value. All these observations indicate that 
the enhancement of surface-to-volume ratio plays an important role for nanosize objects. 


1.2 Characteristic Length Scales 


To understand the physics at the nanoscale, it is useful to have a clear understanding of 
a few length scales that have fundamental meaning associated with them. Whenever a 
characteristic dimension of a nanosize object becomes comparable to one of these length 
scales, we expect to see effects peculiar to the nanoscale, normally absent from its bulk 
counterpart. If we control the movement of elementary particles (electrons, holes, excitons, 
...) by shaping a nanosize object, the quantum effects appear especially in those regions 
that have dimensions comparable to a specific length scale. 

Quantum mechanics shows that moving particles of matter, whether large or small, can 
be described both as waves and as particles (wave–particle duality). Early work in this area
concentrated on demonstrating the wavelike properties of fundamental particles such as
electrons, protons, and neutrons. More recently, massive particles such as C$_{60}$ fullerene (a
molecule with 60 carbon atoms, size 1.1 nm), tetraphenylporphyrin (a biodye molecule,
size 2 nm), and C$_{60}$F$_{48}$ (a fluorinated buckyball of 108 atoms) have been shown to comply
with the wave–particle duality description.

When dealing with nanosize objects, it is often not clear whether a classical or quan- 
tum model should be used to describe their behavior, and there are situations when both 
are needed. One good example is the photoelectric effect of light, for which both wave 
and particle aspects of light are required for a complete description. Another example is 
diffraction of electrons from crystals. The length scale that governs the wave nature of particles
is known as the de Broglie wavelength, denoted as $\lambda_{dB}$ and defined in Aside 1.1. This
length scale plays an important role in nanoscience. We will see later that when motion of
a nanosize object is restricted to dimensions smaller than its de Broglie wavelength, not
only do quantum-size effects appear, but the associated density of states is also modified.

Further insight could be gained by looking at the analogous situation in optics. The 
geometrical-optics approximation of light does not take into account its wave nature. It 
provides us with a strong clue when we could discard the wave effects. Geometrical optics 
works well for analyzing interaction of light with objects whose physical dimensions are 
larger than the wavelength $\lambda$ of light. For example, if light passes through a slit whose
width is much larger than $\lambda$, we can use geometrical optics. However, when the slit width
becomes comparable to $\lambda$, the wave nature of light cannot be ignored, and the diffraction
effects will be pronounced and noticeable. If we replace the wavelength of light by the
de Broglie wavelength of a particle, the same reasoning can be used to decide when the
wave nature of the particle becomes relevant. We shall occasionally use the term quantum
particle to remind the reader that we intend to use both properties judiciously as needed to
model certain effects.


Aside 1.1 de Broglie Wavelength 


The de Broglie wavelength, associated with any moving particle with momentum p, is
defined as

    $\lambda_{dB} = h/p$,    (1.1)

where the Planck constant has a value $h = 6.626 \times 10^{-34}$ J·s. It is useful to express p in
terms of relativistic energy E and mass m of the moving particle:

    $E^2 = p^2c^2 + m^2c^4$,    (1.2)

where c is the speed of light in vacuum. This relation implies that $\lambda_{dB}$ can be written as

    $\lambda_{dB} = \frac{hc}{\sqrt{E^2 - m^2c^4}}$.    (1.3)

When a particle is traveling close to the speed of light, its rest-mass energy $mc^2$ can be
discarded compared to E to obtain

    $\lambda_{dB} \approx hc/E$.    (1.4)

If the particle is traveling much slower than the speed of light ($pc \ll mc^2$), we can use
the approximation $E \approx mc^2 + p^2/2m$. Such a particle has a kinetic energy given by
$E_K = p^2/2m$, and its de Broglie wavelength is given by

    $\lambda_{dB} \approx \frac{h}{\sqrt{2mE_K}} = \frac{h}{mv}$,    (1.5)

where v is the particle’s speed.


The magnitude of $\lambda_{dB}$ is immeasurably small for macroscopic objects such as humans,
but it is of the same order (0.1 nm) as chemical bonds for subatomic particles. As its value
depends on the momentum, $\lambda_{dB}$ can vary considerably for any particle, depending on its
mass and speed (or equivalently its kinetic energy). For example, consider a situation where
electrons have a kinetic energy of about 100 eV, resulting in a de Broglie wavelength of
about 0.1 nm. This value is comparable to typical spacing between atoms in a crystal. As
a result, electrons behave as waves and are diffracted by a crystal. However, if the kinetic
energy of electrons is increased to 100 MeV, $\lambda_{dB}$ becomes so small that electrons behave
as particles, showing no diffraction effects in the same crystal.
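
The two electron energies mentioned above can be evaluated numerically. The following Python sketch is illustrative only: the function names are hypothetical, the constants are standard CODATA values typed in by hand, and the energy passed in is assumed to be the kinetic energy, so that the total energy in Eq. (1.3) is the kinetic energy plus $mc^2$.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light in vacuum, m/s
EV = 1.602176634e-19     # joules per electronvolt
M_E = 9.1093837015e-31   # electron rest mass, kg


def de_broglie_wavelength(kinetic_energy_ev: float, mass_kg: float = M_E) -> float:
    """de Broglie wavelength (m) from Eq. (1.3), valid at any speed.

    With E = E_K + m c^2, the quantity E^2 - m^2 c^4 equals
    E_K (E_K + 2 m c^2); this form avoids subtracting two large numbers.
    """
    kinetic = kinetic_energy_ev * EV
    rest_energy = mass_kg * C ** 2
    pc = math.sqrt(kinetic * (kinetic + 2.0 * rest_energy))
    return H * C / pc


def de_broglie_nonrelativistic(kinetic_energy_ev: float, mass_kg: float = M_E) -> float:
    """Slow-particle approximation of Eq. (1.5): h / sqrt(2 m E_K)."""
    return H / math.sqrt(2.0 * mass_kg * kinetic_energy_ev * EV)


if __name__ == "__main__":
    for energy_ev in (100.0, 100.0e6):
        wavelength = de_broglie_wavelength(energy_ev)
        print(f"E_K = {energy_ev:.3g} eV -> lambda_dB = {wavelength:.3e} m")
```

For 100 eV electrons both functions give a wavelength of about 0.12 nm, whereas at 100 MeV the relativistic expression yields a value of the order of $10^{-14}$ m, far below any interatomic spacing.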

Depending on the context, the de Broglie wavelength has different names. The thermal
length $\lambda_T$, for example, is a length scale related to the de Broglie wavelength of a gas
consisting of noninteracting, slowly moving particles in equilibrium at temperature T. The
kinetic energy $E_K$ for such particles is of the order of $\pi k_B T$, where $k_B$ is the Boltzmann
constant. Using this value in Eq. (1.5), the thermal length of this gas is obtained from

    $\lambda_T = \frac{h}{\sqrt{2\pi m k_B T}}$.    (1.6)


Consider a gas in thermal equilibrium. The characteristic interparticle distance d for such a
gas can be estimated from the packing density of cubes of volume $d^3$ filling a unit volume,
giving $d = n^{-1/3}$, where n is the number density of particles. As an example, oxygen at
room temperature has d around 3 nm, whereas $\lambda_T$ is estimated to be around 0.02 nm from
Eq. (1.6). Since $\lambda_T \ll d$, it follows that this gas does not require a quantum-mechanical
treatment (except for collisions that do require a quantum treatment).
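
The oxygen estimate can be reproduced with a short Python sketch. Several inputs are assumptions not stated in the text: the gas is taken to be ideal at 300 K and 1 atm, so that $n = P/(k_B T)$, and the molecular mass of O$_2$ is taken as 31.998 atomic mass units. The function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def thermal_length(mass_kg: float, temperature_k: float) -> float:
    """Thermal length of Eq. (1.6): h / sqrt(2 pi m k_B T), in metres."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)


def interparticle_distance(number_density_m3: float) -> float:
    """Characteristic spacing d = n^(-1/3), in metres."""
    return number_density_m3 ** (-1.0 / 3.0)


if __name__ == "__main__":
    temperature = 300.0          # K (assumed room temperature)
    pressure = 101325.0          # Pa (assumed 1 atm)
    mass_o2 = 31.998 * AMU       # kg (assumed molecular mass of O2)

    density = pressure / (K_B * temperature)   # ideal-gas number density
    d = interparticle_distance(density)
    lam = thermal_length(mass_o2, temperature)
    print(f"d = {d * 1e9:.2f} nm, lambda_T = {lam * 1e9:.3f} nm, ratio = {lam / d:.1e}")
```

Under these assumptions the spacing comes out at a few nanometers and the thermal length at roughly 0.02 nm, consistent with the values quoted above. Since $\lambda_T$ varies as $T^{-1/2}$, lowering the temperature by a factor of 100 increases it tenfold.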

As a second example, consider electron—hole pairs inside a semiconductor. Since the 
effective mass of electrons and holes can vary from 1% to 100% of a free electron’s mass, 
the corresponding de Broglie wavelength at 300 K is in the range of 73 nm to 7.3 nm. As the 
temperature approaches 3 K, these values increase by a factor of 10, as seen clearly from 
Eq. (1.6). Apart from temperature, the de Broglie wavelength can also be manipulated in 
semiconductors through impurity doping. The main point to stress is that $\lambda_{dB}$ of charged
carriers in semiconductors can vary from 7 nm to 1 μm, depending on the experimental
conditions, and a quantum treatment may be needed in some cases. 

The situation becomes somewhat different for metals, which are characterized by nearly 
free-moving electrons whose density can be controlled thermally, chemically, or optically. 
Metals exhibit high electrical conductivity as well as high thermal conductivity compared 
to other materials. Metals also generally obey the Wiedemann–Franz law at high temperatures
(which states that the ratio of thermal to electrical conductivity is proportional
to the temperature). Owing to the abundance of electrons with little interaction among 
them, the simplest model assumes that metals are a collection of positive ions in a sea 
of non-interacting electrons and treats metals as a free-electron gas. The positive ions are 
assumed to form a regular lattice and provide a periodic potential in the Schrödinger equation.
As a consequence of the Bloch theorem, the electron’s wave function is proportional
to $\exp(i\mathbf{k} \cdot \mathbf{r})$, where $\hbar\mathbf{k}$ represents the electron’s momentum and $\mathbf{r}$ is its position
vector. The Fermi energy for a free-electron gas can be written as $E_F = \hbar^2 k_F^2/2m$, where
m is the effective mass of electrons in the metal and $k_F$ is the wave number for this
energy.

The energy distribution for a free-electron gas is governed by the Fermi–Dirac distribution;
it is similar in nature to the Maxwell–Boltzmann distribution of an ideal gas
(or Planck’s blackbody radiation). The Fermi–Dirac distribution depends on the chemical
potential $\mu$ of the material and its temperature, both of which can be experimentally
measured. The significance of chemical potential is most readily seen in the limiting case
of T = 0 K, for which the ground state is obtained by placing all electrons into the lowest
available energy levels up to the energy $E_F$. As a result of this filling, there is a sharp boundary
in the three-dimensional k-space between the filled and empty states. This surface is
known as the Fermi surface and its shape is spherical for a free-electron gas.
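
The sharp boundary at T = 0 K can be illustrated numerically. The sketch below assumes the standard form of the Fermi–Dirac occupation, $f(E) = 1/\{\exp[(E - \mu)/k_B T] + 1\}$, which is not written out in the text above; the function name and the sample value of 7 eV for the chemical potential are illustrative choices.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant, eV/K


def fermi_dirac(energy_ev: float, mu_ev: float, temperature_k: float) -> float:
    """Mean occupation of a state at energy E for chemical potential mu.

    At T = 0 K the distribution is a step: filled below mu, empty above
    (the value exactly at E = mu is set to 1/2 by convention here).
    """
    if temperature_k == 0.0:
        if energy_ev < mu_ev:
            return 1.0
        if energy_ev > mu_ev:
            return 0.0
        return 0.5
    x = (energy_ev - mu_ev) / (K_B_EV * temperature_k)
    # Evaluate in a form that never exponentiates a large positive number.
    if x >= 0.0:
        z = math.exp(-x)
        return z / (1.0 + z)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    mu = 7.0  # eV, illustrative value of the order of a metallic Fermi energy
    for temperature in (0.0, 300.0, 3000.0):
        row = [fermi_dirac(mu + offset, mu, temperature) for offset in (-0.2, -0.05, 0.0, 0.05, 0.2)]
        print(f"T = {temperature:6.0f} K:", " ".join(f"{value:.3f}" for value in row))
```

At 300 K the occupation changes from nearly 1 to nearly 0 within a few tenths of an electronvolt around $\mu$, which is why the zero-temperature picture of a sharp Fermi surface remains useful for metals.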

The Fermi wavelength $\lambda_F$ is a length scale defined as the de Broglie wavelength of
electrons at the Fermi energy $E_F$. If the momentum of electrons at this energy is $p_F\hat{\mathbf{p}}$,
where $\hat{\mathbf{p}}$ is a unit vector, then $E_F = p_F^2/2m$ is the kinetic energy of each electron.
Thus, the Fermi wavelength is given by

    $\lambda_F = \frac{h}{p_F} = \frac{h}{\sqrt{2mE_F}}$.    (1.7)

This relation provides good agreement with experimentally observed behavior because 
only electrons in the vicinity of the Fermi surface participate in most physical processes. 
For this reason, the Fermi energy and the shape of the Fermi surface are important in 
practice. Typically, $\lambda_F$ for metals is around 0.6 nm, and this value can be controlled by
changing the density of free electrons. This dependence is widely exploited in semiconductors
by changing the type and density of dopants. The Fermi wavelength as a length
scale plays an important role in both the design and analysis of metal-based nanoscale
devices. For example, as we will see later, $\lambda_F$ is an important parameter for understanding
quantum conductance of nanostructures.
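
As a rough numerical check of the Fermi wavelength, the following Python sketch evaluates $\lambda_F = h/\sqrt{2mE_F}$. The Fermi energy of 7 eV (a value commonly quoted for copper) and the use of the free-electron mass in place of the effective mass are assumptions made for illustration; the function name is hypothetical.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
EV = 1.602176634e-19     # joules per electronvolt
M_E = 9.1093837015e-31   # free-electron mass, kg


def fermi_wavelength(fermi_energy_ev: float, mass_kg: float = M_E) -> float:
    """Fermi wavelength (m): de Broglie wavelength at the Fermi energy."""
    fermi_momentum = math.sqrt(2.0 * mass_kg * fermi_energy_ev * EV)
    return H / fermi_momentum


if __name__ == "__main__":
    # Illustrative Fermi energies in eV; only the order of magnitude matters here.
    for fermi_energy in (3.0, 5.0, 7.0):
        print(f"E_F = {fermi_energy:.1f} eV -> lambda_F = {fermi_wavelength(fermi_energy) * 1e9:.2f} nm")
```

For Fermi energies of a few electronvolts the result is a fraction of a nanometer, in line with the typical value quoted above. Because $\lambda_F$ scales as $E_F^{-1/2}$, lowering the carrier density (and hence $E_F$), as in a doped semiconductor, lengthens the Fermi wavelength.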

Sometimes a characteristic temperature, called the Fermi temperature and denoted by $T_F$, is
also defined as $T_F = E_F/k_B$. For most metals, its value lies in the range $10^4$ K to $10^5$ K and
is much higher than room temperature or typical temperatures at which metals are used.
Thus, under typical operation, temperature changes do not affect the Fermi momentum
$\mathbf{p}_F$ much. Suppose the perturbation to $\mathbf{p}_F$ is given by $\delta\mathbf{p}$. Then, the dispersion relation
providing the electron’s energy $E_e$ as a function of its momentum can be written as

$$E_e = \frac{1}{2m}(p_F\hat{\mathbf{p}} + \delta\mathbf{p})\cdot(p_F\hat{\mathbf{p}} + \delta\mathbf{p}) = E_F + \mathbf{v}_F\cdot\delta\mathbf{p} + O(|\delta\mathbf{p}|^2), \tag{1.8}$$

where $\mathbf{v}_F$ is the Fermi velocity with the magnitude $p_F/m$. Since $\delta\mathbf{p}$ is relatively small, we
retain only the first-order term. This approximation is called the semiclassical
approximation, and it enables us to interpret the electron’s motion as analogous to that of a
classical particle moving with the Fermi velocity. Because of the relatively large value of the Fermi
temperature for most metals, they can be described with sufficient accuracy assuming
zero-temperature conditions.
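
The size of the neglected second-order term can be checked with a short sketch. It assumes $E_F = 7.0$ eV (an illustrative value), takes $\delta\mathbf{p}$ parallel to $\mathbf{p}_F$ with a magnitude of 1% of $p_F$, and compares the exact kinetic energy with the first-order form; all function names are hypothetical.

```python
import math

M_E = 9.1093837015e-31   # free-electron mass (kg)
EV = 1.602176634e-19     # joules per electron volt
K_B = 1.380649e-23       # Boltzmann constant (J/K)


def fermi_temperature(fermi_energy_ev):
    """T_F = E_F / k_B in kelvin."""
    return fermi_energy_ev * EV / K_B


def energy_exact(p_f, dp, mass=M_E):
    """Exact kinetic energy (p_F + dp)^2 / 2m for dp parallel to p_F."""
    return (p_f + dp) ** 2 / (2.0 * mass)


def energy_first_order(p_f, dp, mass=M_E):
    """Linearized energy E_F + v_F * dp with v_F = p_F / m."""
    return p_f ** 2 / (2.0 * mass) + (p_f / mass) * dp


e_f_ev = 7.0                                   # assumed Fermi energy (eV)
t_f = fermi_temperature(e_f_ev)
p_f = math.sqrt(2.0 * M_E * e_f_ev * EV)
dp = 0.01 * p_f                                # assumed 1% perturbation
rel_error = (energy_exact(p_f, dp) - energy_first_order(p_f, dp)) / energy_exact(p_f, dp)
print(f"T_F = {t_f:.3e} K, relative error of linearization = {rel_error:.1e}")
```

The relative error scales as $(\delta p/p_F)^2$, so the linearized dispersion is accurate whenever the perturbation is small compared with the Fermi momentum.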

Another widely used length scale is the electron’s mean free path $l_e$. Its value is calculated
by averaging the distance different electrons travel between two scattering events.
Clearly, this value is not precise unless the type of scattering (elastic, inelastic, ...) is specified.
For example, one can calculate this quantity for elastic scattering, occurring when
electrons come near impurities or collide with solid boundaries. Noting that only electrons
close to the Fermi surface participate in such scattering, the average time $\tau_e$ between two
elastic scattering events, known as the mean collision time, can be used to find the electron
mean free path as $l_e = \tau_e v_F$. The collision time $\tau_e$ is also called the relaxation time for
elastic collisions of electrons.

However, inelastic collisions of electrons are more prevalent in solids. These can occur
when electrons interact with other electrons, phonons, plasmons, or change their energy
through interband transitions. The fraction of electrons subjected to such inelastic processes
depends on the energy of electrons. Inelastic scattering affects the phase coherence of
electrons and, if it occurs multiple times, all coherence is lost. In this situation, electrons
obey a diffusion law with a diffusion coefficient $D_{ie}$, and it becomes possible to define
another length scale, the inelastic scattering length, as

$$l_{ie} = \sqrt{D_{ie}\tau_{ie}}, \tag{1.9}$$

where $\tau_{ie}$ is the mean time between two inelastic scattering events.
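
The two scattering lengths can be compared with a minimal sketch. The numerical inputs (collision time, Fermi velocity, diffusion coefficient, inelastic time) are hypothetical order-of-magnitude values chosen only to illustrate the formulas, and the function names are not from the text.

```python
import math


def mean_free_path(tau_e, v_f):
    """Elastic mean free path: collision time multiplied by Fermi velocity."""
    return tau_e * v_f


def inelastic_scattering_length(d_ie, tau_ie):
    """Diffusive length sqrt(D * tau) covered between inelastic events."""
    return math.sqrt(d_ie * tau_ie)


# Hypothetical inputs, for illustration only.
l_e = mean_free_path(tau_e=25e-15, v_f=1.5e6)                  # seconds, m/s
l_ie = inelastic_scattering_length(d_ie=1e-2, tau_ie=1e-12)    # m^2/s, seconds
print(f"l_e = {l_e * 1e9:.1f} nm, l_ie = {l_ie * 1e9:.1f} nm")
```

The elastic length grows linearly with the collision time, whereas the inelastic length grows only as the square root of the inelastic time because the motion between inelastic events is diffusive.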

Coherence plays a central role in all scattering events because it is related to the phase
of a wave function in quantum mechanics. The degree of coherence is set by the extent
of correlation between the phases of a wave at different locations (spatial coherence) or
at different times (temporal coherence). Temporal coherence measures how monochromatic
a wave is. For example, if an electromagnetic wave has a spectral bandwidth $\Delta\nu$, its
coherence time is defined as $t_c = 1/\Delta\nu$. In practice, $\Delta\nu$ is taken to be the full width at
half maximum (FWHM) of the optical spectrum. However, more precise measures can be
found in Ref. [23]. The coherence time is the average time over which phase information
is preserved during propagation. Since light travels a distance $ct_c$ during this time, a length
scale known as the coherence length is defined as

$$l_c = ct_c = c/\Delta\nu. \tag{1.10}$$
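
A short sketch shows how the coherence length follows from a measured spectral width. It assumes the common narrow-band conversion $\Delta\nu \approx c\,\Delta\lambda/\lambda^2$ between wavelength and frequency widths (not stated in the text) and uses a hypothetical source centered at 630 nm with a 20 nm FWHM.

```python
C = 299_792_458.0  # speed of light in vacuum (m/s)


def bandwidth_from_wavelength(center_wavelength, fwhm_wavelength):
    """Narrow-band approximation: delta_nu ~ c * delta_lambda / lambda^2."""
    return C * fwhm_wavelength / center_wavelength ** 2


def coherence_time(bandwidth_hz):
    """t_c = 1 / delta_nu."""
    return 1.0 / bandwidth_hz


def coherence_length(bandwidth_hz):
    """l_c = c * t_c = c / delta_nu."""
    return C / bandwidth_hz


# Hypothetical source: 630 nm center wavelength, 20 nm FWHM.
delta_nu = bandwidth_from_wavelength(630e-9, 20e-9)
t_c = coherence_time(delta_nu)
l_c = coherence_length(delta_nu)
print(f"t_c = {t_c * 1e15:.1f} fs, l_c = {l_c * 1e6:.1f} um")
```

With these inputs the coherence length reduces to $\lambda^2/\Delta\lambda$, so a broadband source is coherent over only a few tens of wavelengths.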


One may wonder whether the preceding approach used for photons and electrons can
be extended to phonons, which are responsible for thermal effects. The main quantity of
interest is temperature, and how it affects the coherence of phonons. By considering the
coherence of blackbody radiation [24] and analyzing thermal conductivity of superlattices
[25], it has been deduced that the coherence time of phonons is approximately $t_c = \hbar/k_BT$.
This can be understood by relating the thermal spreading of energy levels ($\hbar/t_c$) to the thermal
energy $k_BT$. As before, if we combine the coherence time with the diffusion coefficient
characterizing many scattering events, a new length scale, the thermal diffusion length $l_t$, can
be defined as

$$l_t = \sqrt{D_{ie}t_c} = \sqrt{\frac{D_{ie}\hbar}{k_BT}}. \tag{1.11}$$

As the physical mechanisms responsible for the two length scales, $l_{ie}$ and $l_t$, occur
simultaneously, they are not independent of each other. In practice, the smaller of the two
lengths sets an upper limit on the phase coherence of any particle requiring a quantum
description.
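
The temperature dependence of these thermal scales can be sketched numerically. The code reads the constant in the coherence time as the reduced Planck constant $\hbar$ (an assumption; using $h$ instead would scale the time by $2\pi$), and the diffusion coefficient is a hypothetical value.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
K_B = 1.380649e-23       # Boltzmann constant (J/K)


def thermal_coherence_time(temperature):
    """t_c = hbar / (k_B T), assuming the reduced Planck constant."""
    return HBAR / (K_B * temperature)


def thermal_diffusion_length(diffusion_coeff, temperature):
    """l_t = sqrt(D * t_c) = sqrt(D * hbar / (k_B T))."""
    return math.sqrt(diffusion_coeff * thermal_coherence_time(temperature))


t_c_300 = thermal_coherence_time(300.0)
l_t_300 = thermal_diffusion_length(1e-4, 300.0)   # hypothetical D = 1e-4 m^2/s
l_t_75 = thermal_diffusion_length(1e-4, 75.0)
print(f"t_c(300 K) = {t_c_300 * 1e15:.1f} fs, l_t(300 K) = {l_t_300 * 1e9:.2f} nm")
```

Since $l_t \propto T^{-1/2}$, cooling from 300 K to 75 K doubles the thermal diffusion length.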

Three other length scales become important when we consider the interaction of charged
particles, such as electrons and holes. Even though these length scales have a semiclassical
origin, they provide considerable insight, without requiring a full quantum-mechanical
treatment. The first length scale is the Debye length $l_D$, defined in Aside 1.2 and applicable
to plasmas such as a free-electron gas. Its physical meaning can be understood as
follows. Even though charges in a plasma interact with each other, the plasma itself can
be considered electrically neutral, until an external charge is introduced in the middle
of it. The electrical potential of such a localized charge falls with distance exponentially,
and the Debye length is equal to the length where it has fallen to 1/e of its initial
value. The importance of the Debye length in nanotechnology has been discussed in
Refs. [26, 27].

Aside 1.2 Debye Length


Consider a plasma of free electrons inside an electrically neutral metal. It differs from a
normal gas because of Coulomb interactions among its charged particles, even though it
has a tendency to remain electrically neutral. If the neutrality condition is disturbed by
introducing an external charge within the plasma, it quickly reacts to smear out deviations
from the charge neutrality. The restoration of charge neutrality happens in a spherical
volume centered at the external charge, and the radius of this sphere is called the Debye length.


To find the Debye length, let $+Q$ be the magnitude of the external charge (assumed to be
located at the origin). As positive ions are much heavier than electrons, they barely move,
but the charge density of surrounding electrons changes due to their high mobility. In
thermal equilibrium, the number density of electrons changes from its initial value $N_0$
to $N_e(r) = N_0\exp[-q_e\phi(r)/k_BT]$, where $\phi(r)$ is the electrostatic potential at a distance $r$
from the charge location. There are many assumptions built into this expression, including that
the electron density obeys the Boltzmann distribution, the potential $\phi(r)$ is isotropic around
the point test charge, and $|q_e\phi(r)| \ll k_BT$.


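We can find $\phi(r)$ by solving the Poisson equation, $\nabla^2\phi(r) = -\rho/\epsilon$, with the charge density
$\rho = q_e[N_e(r) - N_0]$, where $q_e$ is the (negative) electron charge and the fixed ions provide
the neutralizing background $-q_eN_0$. Considering the spherical symmetry of the situation, we need to
solve

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\phi}{dr}\right) = \frac{\phi}{l_D^2}, \tag{1.12}$$

where we replaced $N_e$ by the first two terms in its Taylor-series expansion and introduced
the Debye length as

$$l_D = \sqrt{\frac{\epsilon k_BT}{N_0q_e^2}}. \tag{1.13}$$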


We solve Eq. (1.12) with the substitution $\phi = u/r$. The resulting equation, $d^2u/dr^2 = u/l_D^2$,
provides the solution as $\phi(r) = (C/r)\exp(-r/l_D)$, where we retained only the exponentially
decaying part. The constant $C$ can be related to the charge $Q$ by integrating the
Poisson equation over the volume of a sphere of radius $r$. Using the result $C = Q/(4\pi\epsilon)$,
we obtain

$$\phi(r) = \frac{Q}{4\pi\epsilon r}\exp\left(-\frac{r}{l_D}\right). \tag{1.14}$$

This equation shows that the electrostatic potential $\phi(r)$ decreases faster than $1/r$ because
of the finite Debye length.
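
The screening described in this aside can be illustrated with a minimal sketch. It takes the Debye length as $\sqrt{\epsilon k_BT/(N_0q_e^2)}$, the form that follows from linearizing the Boltzmann factor, with $\epsilon = \epsilon_r\epsilon_0$. The density and temperature used below are hypothetical plasma values, and the function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
Q_E = 1.602176634e-19    # magnitude of the electron charge (C)


def debye_length(n0, temperature, eps_r=1.0):
    """l_D = sqrt(eps * k_B * T / (N_0 * q_e^2)) in meters."""
    return math.sqrt(eps_r * EPS0 * K_B * temperature / (n0 * Q_E ** 2))


def coulomb_potential(q, r, eps_r=1.0):
    """Unscreened potential Q / (4 pi eps r)."""
    return q / (4.0 * math.pi * eps_r * EPS0 * r)


def screened_potential(q, r, l_d, eps_r=1.0):
    """Screened potential: the Coulomb form multiplied by exp(-r / l_D)."""
    return coulomb_potential(q, r, eps_r) * math.exp(-r / l_d)


# Hypothetical plasma: N_0 = 1e18 m^-3 at T = 1e4 K.
l_d = debye_length(1e18, 1e4)
ratio_at_l_d = screened_potential(Q_E, l_d, l_d) / coulomb_potential(Q_E, l_d)
print(f"l_D = {l_d * 1e6:.2f} um, screened/unscreened at r = l_D: {ratio_at_l_d:.3f}")
```

At $r = l_D$ the screened potential is smaller than the bare Coulomb potential by exactly the factor $1/e$, which is the sense in which the Debye length marks the range of the external charge.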


The Landau length $l_L$ is the mean distance between electrons and ions in a plasma
for which recombination does not take place. It is defined as the distance from a charge
at which the electrostatic energy is equal to the thermal energy $k_BT$, where $k_B$ is the
Boltzmann constant. It follows from Coulomb’s law that

$$l_L = \frac{q_e^2}{4\pi\epsilon_0k_BT}, \tag{1.15}$$

where $q_e$ is the charge of an electron and $\epsilon_0$ is the vacuum permittivity.
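
For a sense of scale, the Landau length can be evaluated at room temperature with a short sketch; the function name is illustrative and the constants are standard SI values.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
K_B = 1.380649e-23       # Boltzmann constant (J/K)
Q_E = 1.602176634e-19    # magnitude of the electron charge (C)


def landau_length(temperature):
    """Distance at which the Coulomb energy q_e^2/(4 pi eps0 r) equals k_B T."""
    return Q_E ** 2 / (4.0 * math.pi * EPS0 * K_B * temperature)


l_landau_300 = landau_length(300.0)
print(f"l_L(300 K) = {l_landau_300 * 1e9:.1f} nm")
```

The length is inversely proportional to temperature, so doubling $T$ halves $l_L$.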

The final length scale is the Bohr radius $a_0$, defined to be the distance between the
nucleus and the electron in a hydrogen atom in its ground state. Its derivation uses classical
physics and equates the centripetal force, $m_ev^2/a_0$, of an electron circling the proton to
the Coulomb force, $q_e^2/(4\pi\epsilon_0a_0^2)$, between the two particles. It also assumes that the angular
momentum of the electron is quantized such that $m_eva_0 = \hbar$ in the ground state of the
hydrogen atom. It follows that

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{m_eq_e^2}. \tag{1.16}$$

The concept of the Bohr radius can be applied to excitons, which are electron-hole pairs
bound together through the attractive Coulomb force. The role of the proton is played by
the positively charged hole in this case.

However, the situation is more complicated for excitons for two reasons. First, the mass
appearing in Eq. (1.16) may need to be replaced with an effective mass of the electron,
when an electron orbits a hole. If a hole orbits an electron, $m_e$ in Eq. (1.16) should be
replaced with the hole’s effective mass. As the two effective masses are often different, it
becomes necessary to indicate this difference by using $a_{0e}$ and $a_{0h}$ for electrons and holes,
respectively.

Second, electrons and holes have comparable masses. As both particles can move within
the exciton during its finite lifetime, the model used earlier for finding the Bohr radius
of a hydrogen atom is not applicable. Classical mechanics suggests that their combined
motion can be analyzed by taking into account the reduced mass of the exciton given by
$1/m_{eh} = 1/m_e + 1/m_h$. Using this value in Eq. (1.16), the Bohr radius of the exciton,
denoted by $a_{0eh}$, becomes

$$a_{0eh} = \frac{4\pi\epsilon_0\hbar^2}{m_{eh}q_e^2} = a_{0e} + a_{0h}. \tag{1.17}$$

Excitons are called Frenkel or Wannier types depending on whether the Bohr radius $a_{0eh}$ is
smaller or larger than the interatomic spacing of a material. The Bohr model is not a very
accurate representation of either of these excitons, but it provides physical insight that is
not easy to gain from a detailed quantum-mechanical analysis.
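
The additivity of the electron and hole Bohr radii can be verified with a brief sketch. The effective masses ($0.1\,m_0$ for the electron, $0.5\,m_0$ for the hole) are hypothetical, and the optional `eps_r` argument is an added assumption for materials with a relative permittivity; it defaults to 1 so that the code reproduces the vacuum-permittivity form used in the text.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)
HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_0 = 9.1093837015e-31    # free-electron mass (kg)
Q_E = 1.602176634e-19     # magnitude of the electron charge (C)


def bohr_radius(mass=M_0, eps_r=1.0):
    """a = 4 pi eps hbar^2 / (m q_e^2); eps_r is an optional assumption."""
    return 4.0 * math.pi * eps_r * EPS0 * HBAR ** 2 / (mass * Q_E ** 2)


def reduced_mass(m_e, m_h):
    """1/m_eh = 1/m_e + 1/m_h."""
    return 1.0 / (1.0 / m_e + 1.0 / m_h)


a_0 = bohr_radius()                       # hydrogen-atom value
m_e_eff, m_h_eff = 0.1 * M_0, 0.5 * M_0   # hypothetical effective masses
a_0e = bohr_radius(m_e_eff)
a_0h = bohr_radius(m_h_eff)
a_0eh = bohr_radius(reduced_mass(m_e_eff, m_h_eff))
print(f"a_0 = {a_0 * 1e9:.4f} nm, a_0eh = {a_0eh * 1e9:.3f} nm")
```

Because the radius is inversely proportional to the mass, using the reduced mass gives exactly the sum of the two single-particle radii.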


1.3 Quantum Confinement


As mentioned earlier, quantum effects become important when at least one dimension of
an object becomes comparable to one or more length scales discussed in Section 1.2. One
such effect is known as quantum confinement. A quantum particle can take only
discrete energy values. Moreover, its discrete energy levels depend not only on its properties
(such as its mass and speed) but also on the environment in which it is placed. If
we restrict the movement of the particle in any dimension, the energies that the particle can
attain are set by the Schrödinger equation,

$$i\hbar\,\frac{\partial|\Psi\rangle}{\partial t} = H|\Psi\rangle, \tag{1.18}$$

where the Hamiltonian $H$ of the quantum particle is given by

$$H = \frac{\mathbf{p}^2}{2m_e} + V(\mathbf{r}) = \frac{1}{2m_e}(-i\hbar\nabla)^2 + V(\mathbf{r}). \tag{1.19}$$

For example, if free electrons in a metal nanostructure are confined to a region whose
size is smaller than the Fermi wavelength (de Broglie wavelength of electrons at the Fermi
energy), electrons can acquire only discrete energy values found by solving Eq. (1.18) with
a potential $V(\mathbf{r})$ that vanishes inside the region but is infinite outside of it. The appearance
of $+i$ in Eq. (1.18) and $-i$ in Eq. (1.19) implies a specific sign convention discussed in
Aside 1.3.
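
As an illustration of such discrete energies, the sketch below evaluates the standard textbook result for an infinite one-dimensional well of width $L$, $E_n = n^2\pi^2\hbar^2/(2m_eL^2)$. This closed form is not derived in the text and is used here as an assumption; the 1 nm width is a hypothetical value.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # free-electron mass (kg)
EV = 1.602176634e-19      # joules per electron volt


def box_energy_levels(length, n_max, mass=M_E):
    """Energies (in joules) of the lowest n_max states of an infinite 1D well."""
    return [(n * math.pi * HBAR) ** 2 / (2.0 * mass * length ** 2)
            for n in range(1, n_max + 1)]


# Hypothetical well width of 1 nm.
levels_ev = [e / EV for e in box_energy_levels(1e-9, 3)]
print(", ".join(f"{e:.2f} eV" for e in levels_ev))
```

The level spacing scales as $1/L^2$, so shrinking the confining region rapidly pushes the levels apart.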


Aside 1.3 Complex Numbers and Sign of the Imaginary Part 

Although all physical quantities are real under measurements, their mathematical descrip- 
tion often employs complex notation with equations containing either +i or —i, defined as 
the two square roots of —1 [28, 29, 30]. It is obvious that the choice of this sign should 
not change any experimentally observed quantity even though complex quantities are used 
to analyze it. Examples include the complex refractive index of materials and the way 
Schrödinger’s equation is written in Eq. (1.18). 


It is important to choose between $+i$ and $-i$ in a consistent manner and ensure that the same
choice is made for all equations in this book. We make this choice by specifying the electric
field of a plane wave propagating along the positive $z$ direction in the form
$E(z,t) = E_0\exp(-\alpha z)\cos(\beta z - \omega t + \phi)$, where $\omega$ is the angular frequency, $\beta$ is the propagation
constant, $\alpha$ is the attenuation coefficient, and $\phi$ is the phase. We represent this wave as a
phasor using the complex notation and write the electric field as

$$E(z,t) = A\exp[i(\beta + i\alpha)z - i\omega t], \qquad A = E_0e^{i\phi}, \tag{1.20}$$

where $A$ is the complex amplitude and it is understood that the real part of $E$ must be
taken to obtain the actual electric field. Our notation establishes the rotation direction
of the phasor as clockwise with increasing time. With this sign convention, the energy
and momentum operators, $i\hbar\,\partial/\partial t$ and $-i\hbar\nabla$ respectively, agree with their conventionally
accepted forms in quantum mechanics. We use this convention throughout the book.
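
The equivalence of the real field and the real part of its phasor form can be checked numerically. The parameter values below are arbitrary, hypothetical numbers, and the function names are illustrative.

```python
import cmath
import math

# Arbitrary illustrative parameters (not from the text).
E0, ALPHA, BETA, OMEGA, PHI = 1.0, 0.5, 2.0, 3.0, 0.3


def field_real(z, t):
    """Real field E0 exp(-alpha z) cos(beta z - omega t + phi)."""
    return E0 * math.exp(-ALPHA * z) * math.cos(BETA * z - OMEGA * t + PHI)


def field_phasor(z, t):
    """Complex field A exp[i(beta + i alpha) z - i omega t] with A = E0 exp(i phi)."""
    amplitude = E0 * cmath.exp(1j * PHI)
    return amplitude * cmath.exp(1j * (BETA + 1j * ALPHA) * z - 1j * OMEGA * t)


# Largest mismatch between the real field and Re{phasor} on a small grid.
max_diff = max(abs(field_real(z, t) - field_phasor(z, t).real)
               for z in (0.0, 0.5, 1.0, 1.5)
               for t in (0.0, 0.2, 0.4))

# Phase at z = 0 decreases with time, i.e. the phasor rotates clockwise.
phase_later = cmath.phase(field_phasor(0.0, 0.05))
print(f"max mismatch = {max_diff:.1e}, phase at t = 0.05: {phase_later:.2f} rad")
```

The imaginary part $i\alpha$ added to the propagation constant is what produces the decaying envelope $\exp(-\alpha z)$ under this sign convention.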


We also need to make a choice for the sign of the exponential term in the Fourier transform
used often in this book. Consider an arbitrary electric field $E(z,t)$. We define its Fourier
transform with respect to both $z$ and $t$ as

$$\mathcal{F}\{E\}(k,\omega) = \iint E(z,t)\exp(-ikz + i\omega t)\,dz\,dt, \tag{1.21}$$

where $k$ and $\omega$ are the Fourier variables corresponding to $z$ and $t$, respectively. The inverse
Fourier transform is then defined as

$$E(z,t) = \left(\frac{1}{2\pi}\right)^2\iint \mathcal{F}\{E\}(k,\omega)\exp(ikz - i\omega t)\,dk\,d\omega, \tag{1.22}$$

where the factor of $2\pi$ ensures $\mathcal{F}^{-1}\mathcal{F} = 1$. It is important to note that opposite signs are
used for the spatial and temporal Fourier transforms.


Even though the fundamentals remain the same, the situation becomes more complex for 
semiconductor materials. This is because both holes and electrons coexist inside a semi- 
conductor. We need to consider not only the de Broglie wavelengths of electrons and holes, 
but also their Bohr radii to determine the size at which the quantum-confinement effects 
become important. The smallest of these quantities is ~10 nm in typical semiconductors, 
enabling one to observe quantum effects more easily compared to metallic structures. 

As a concrete example, suppose the critical dimension of interest in a semiconductor
has a length $d_c$. Depending on how $d_c$ compares to the exciton’s Bohr radius $a_{0eh}$, three
different regimes should be considered. In the strong-confinement regime ($d_c \ll a_{0eh}$), the
electrostatic energy resulting from the Coulomb force between an electron and a hole
is much smaller than the quantum-confinement energies. As a result, the electron and the hole
forming the exciton can be analyzed independently of each other. In the intermediate-confinement
regime ($d_c \sim a_{0eh}$), motions of the electron and the hole are strongly correlated
through the Coulomb attraction. As a result, energy levels of the exciton depend on both
the Coulomb interaction and the boundary conditions applicable to the wave function of
the exciton. In the weak-confinement regime ($d_c \gg a_{0eh}$), the motion of the exciton’s center
of mass determines the energy spectra owing to a much stronger electron-hole Coulomb
interaction.


1.3.1 Dimensionality of Nanostructures 


An object may not be nanoscale in all three dimensions. When an object is smaller than 
a critical length scale along a specific dimension, its motion is constrained only along 
that dimension. Conventionally, this dimension is not counted when dimensionality of that 
object is calculated. In other words, dimensionality of an object refers to the dimensions 
in which its motion is not restricted. Clearly, all bulk materials are three dimensional (3D) 
according to this convention used to classify nanoscale structures. 

We can now understand what is meant by zero-dimensional (0D), one-dimensional (1D),
and two-dimensional (2D) materials [31]. The 2D, 1D, and 0D nanostructures are also
referred to in the literature as quantum wells (QWs), quantum wires (QWRs), and quantum
dots (QDs), respectively. However, caution must be exercised because quantum wells often
confine electrons only partially, and their behavior may differ from that of a true 2D structure.
For this reason, QWs, QWRs, and QDs with partial confinement are called Q2D,
Q1D, and Q0D structures, respectively, with Q standing for their quasi-dimensional nature.

Often, a higher-dimensional structure can be built from nanostructures of lower dimensionality.
Moreover, multiple copies of nanosize structures could be assembled in 1D, 2D, or 3D
to form composite materials, known as nanostructured materials. An example is provided
by QWs of two different types stacked on top of each other. If this assembly is periodic,
the resulting structure is called a superlattice.

The simplest nanosize device for observing quantum confinement in all three dimen- 
sions is a quantum dot. It can be made using metal oxides (such as ZnO), semiconductor 
materials (such as CdSe), or large molecules (such as fullerene C60). Among these, 
semiconductor QDs have prominence in applications because, when pumped electri- 
cally or optically, they emit light whose wavelength depends on the QD’s size. To 
understand the origin of this size dependence, we review briefly the physics of semi- 
conductors. 

The electronic structure of a bulk semiconductor is characterized through its energy
bands, described mathematically by $E_n(\mathbf{k})$ and separated by forbidden energy gaps.
Here, $E_n$ is the energy of an electron of momentum $\hbar\mathbf{k}$ in the $n$th band. The valence band is
the last band full of electrons. The band above it is called the conduction band; it contains
empty delocalized electronic states. Figure 1.4 shows these two bands, whose parabolic
shape results from the relation $E_n = \hbar^2k^2/(2m_e)$. The minimum energy required to transfer
an electron from the valence band to the conduction band is called the band-gap energy $E_g$
of the semiconductor. Whenever a photon having energy $E_{ph} > E_g$ is absorbed, the excited
electron moves to an empty state in the conduction band, leaving a hole (an unoccupied
electronic state) in the valence band. Often, the excited electron and its associated hole
attract each other through the Coulomb force and form an exciton. Thermal effects in bulk
semiconductors make it difficult to observe such excitons until the temperature is reduced
considerably below room temperature.

Figure 1.4 A continuum of energy levels in the valence and conduction bands of a bulk semiconductor (left) transforms into
discrete states in QDs (right). Energy of photons (arrow length) emitted by a QD increases as its size shrinks because
discrete energy levels of a QD change with its size.

If QDs are made using a semiconductor material, electrons can only move within their
volume. The resulting confinement in all three dimensions modifies electronic states in
each band such that energy can only take discrete values. A second consequence of this
confinement is an increase in the binding energy of excitons, enabling one to observe the
excitonic effects at room temperature and to exploit the nonlinear effects in optoelectronic
devices. Under suitable conditions, an exciton can be forced to emit light, if its electron
jumps back to its original state inside the valence band (recombines with the hole), releasing
the energy difference in the form of a photon. Such light emission enables one to
observe the size dependence of allowed energy levels in QDs.

As an example, cadmium selenide (CdSe) in its bulk form is a semiconductor with a band
gap, $E_g = hc/\lambda = 1.74$ eV, that corresponds to a wavelength $\lambda \approx 713$ nm. However, one
can make CdSe QDs with a size as small as 4 nm (containing about 1500 atoms). Such
QDs are found to emit light at 530 nm when pumped optically [32]. The reason behind this
change in the wavelength is that the lowest energy level of the QDs has an energy larger
than that of the conduction-band edge, resulting in a larger photon energy, or shorter
emitted wavelength, when an electron occupying this state recombines with the hole. Indeed,
it is possible to control the emission wavelength by varying the diameter of QDs, as shown
schematically in Figure 1.4.
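
The conversion between photon energy and wavelength, $E = hc/\lambda$, can be checked with a few lines of code. The sketch below converts the bulk band gap to a wavelength and the quoted QD emission wavelength to a photon energy; the function names are illustrative.

```python
H = 6.62607015e-34        # Planck constant (J s)
C = 299_792_458.0         # speed of light in vacuum (m/s)
EV = 1.602176634e-19      # joules per electron volt

HC_EV_NM = H * C / EV * 1e9   # h*c expressed in eV nm


def wavelength_nm(energy_ev):
    """Wavelength (nm) of a photon with the given energy (eV)."""
    return HC_EV_NM / energy_ev


def photon_energy_ev(wavelength):
    """Photon energy (eV) for a wavelength given in nm."""
    return HC_EV_NM / wavelength


bulk_gap_wavelength = wavelength_nm(1.74)      # bulk CdSe band gap
qd_photon_energy = photon_energy_ev(530.0)     # QD emission quoted in the text
confinement_shift = qd_photon_energy - 1.74    # extra energy due to confinement
print(f"{bulk_gap_wavelength:.0f} nm, {qd_photon_energy:.2f} eV, shift {confinement_shift:.2f} eV")
```

The difference between the QD photon energy and the bulk band gap gives a direct estimate of the energy added by confinement.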

Figure 1.5 shows 36 configurations illustrating how the nanoscale dimensioning of materials
leads to different arrangements [31]. The notation $k$D$lmn\ldots$ is used to differentiate
different structures, where the integer $k$ denotes the dimensionality of the whole nanostructure,
while integers $l, m, n, \ldots$ denote the dimensionality (0, 1, 2, or 3) of its distinct build
units; the number of integers equals the number of such units. Using this classification
scheme, it is possible to count the number of possible nanostructures in different dimensions:
3 of type $k$D (top row), 9 of type $k$D$l$ using one build unit, 19 of type $k$D$lm$ using
two build units, and 5 of type $k$D$lmn$ using three build units. Adding these, we obtain 36
(3 + 9 + 19 + 5) classes of nanostructures shown in Figure 1.5.

Even though all energy levels are discrete in 0D QDs made of a semiconductor material,
they form mini-bands in 1D (QWRs) and 2D (QWs) structures. These features open
up many novel opportunities for engineering material properties. Even though metals look
dull because they lack the band gap of semiconductors, exotic behavior can be seen in metal
nanoclusters because they exhibit resonant oscillations of electrons at certain optical frequencies.
When such clusters are subjected to an external electromagnetic field, the dynamics
of the electron gas is characterized by the presence of plasmonic oscillations. To the lowest
order, the linear response of an electron gas is represented by the plasma frequency, which scales
with the number density $N_0$ of electrons as $\omega_p \propto \sqrt{N_0}$. In particular, the plasma
frequency is independent of the size of the object containing the electron gas [33].
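
The square-root scaling can be made concrete with the standard free-electron expression $\omega_p = \sqrt{N_0q_e^2/(\epsilon_0m_e)}$, which is not written out in the text and is used here as an assumption. The electron density of $5.9 \times 10^{28}$ m$^{-3}$ is an illustrative value of the order found in noble metals.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)
M_E = 9.1093837015e-31    # free-electron mass (kg)
Q_E = 1.602176634e-19     # magnitude of the electron charge (C)


def plasma_frequency(n0, mass=M_E):
    """Free-electron plasma frequency sqrt(N_0 q_e^2 / (eps0 m)) in rad/s."""
    return math.sqrt(n0 * Q_E ** 2 / (EPS0 * mass))


n0 = 5.9e28                                  # assumed electron density (m^-3)
omega_p = plasma_frequency(n0)
scaling = plasma_frequency(4.0 * n0) / omega_p
print(f"omega_p = {omega_p:.2e} rad/s, ratio for 4x density = {scaling:.1f}")
```

Only the density enters the expression, consistent with the statement that the plasma frequency does not depend on the size of the object.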

Figure 1.5 Classification of nanostructures into 36 possible classes; see Ref. [31].

From a physics standpoint, the plasma frequency marks the limiting frequency beyond
which a metal can no longer screen electric fields. The oscillations arise owing to the
appearance of a restoring Coulomb force when electrons are displaced from their charge-neutral
configuration (thus creating a net positive charge). Owing to their inertia, electrons
do not simply replenish the positive region, but travel further away, thus re-creating an
excess positive charge. This effect gives rise to coherent oscillations of an electron gas at
the plasma frequency. The coherence of these oscillations is progressively destroyed by
collisions of an electron with other electrons or phonons [34, 35, 36]. Although plasmons
can also decay radiatively (an energy-losing process), this kind of decay is negligible under
adiabatic conditions. In contrast, radiationless decay occurs mainly via dephasing (known
as Landau damping). This kind of decay leaves the energy of electrons unchanged but
destroys coherence of the collective oscillations.

Figure 1.6 Increase in the Kubo gap as the size of a metal cluster is reduced from very large (bulk) to a level containing only a few
atoms. As a result, a metal first behaves as a semiconductor and then as an insulator [37].

It is interesting to ask what would happen if one continues to decrease the size of a
metal cluster to the level of a few nanometers. In this situation, the Fermi wavelength of
electrons becomes comparable to the dimensions of the nanostructure, triggering
quantum-confinement effects. Even though a large metal structure has closely spaced energy levels,
Kubo used the free-electron model to show that the average spacing between energy levels
increases as its volume decreases [38, 39]. As a result, a gap, known as the Kubo gap,
opens up between the Fermi energy (the highest occupied state) and the lowest unoccupied
state, as shown in Figure 1.6. This Kubo gap, essentially being the average spacing between
consecutive energy levels in a nanostructure, depends inversely on the number of atoms $N_a$
in the nanocluster as

$$\delta_K \approx \frac{4E_F}{3N_a}. \tag{1.23}$$

When the thermal energy $k_BT$ of electrons becomes significantly smaller than the energy of
the Kubo gap, the metallic cluster displays properties different from those expected from
a metal. As seen in Figure 1.6, a sufficiently small metal cluster shows semiconducting or
insulator behavior, depending on its size.
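
To see at what cluster size the Kubo gap becomes comparable to the thermal energy, the relation $\delta_K \approx 4E_F/(3N_a)$ can be evaluated directly. The Fermi energy of 5.5 eV is an assumed illustrative value (close to that of silver), and the function names are hypothetical.

```python
K_B_EV = 8.617333262e-5   # Boltzmann constant (eV/K)


def kubo_gap_ev(fermi_energy_ev, n_atoms):
    """Kubo gap 4 E_F / (3 N_a) in eV."""
    return 4.0 * fermi_energy_ev / (3.0 * n_atoms)


def crossover_atoms(fermi_energy_ev, temperature):
    """Number of atoms at which the Kubo gap equals k_B T."""
    return 4.0 * fermi_energy_ev / (3.0 * K_B_EV * temperature)


gap_100 = kubo_gap_ev(5.5, 100)        # assumed E_F = 5.5 eV, 100-atom cluster
n_cross = crossover_atoms(5.5, 300.0)  # room temperature
print(f"gap(100 atoms) = {gap_100 * 1e3:.0f} meV, crossover near {n_cross:.0f} atoms")
```

Clusters well below the crossover size have a gap exceeding $k_BT$ at the chosen temperature, which is the regime where nonmetallic behavior is expected.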

So far, we have considered nanosize objects in which quantum confinement is built into
the structure during fabrication. It is also possible to change the confinement of a structure
using external fields. For example, when a strong magnetic field $B_z$ is applied in the $z$
direction of a bulk metal, energy levels of the electrons change as

$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega_c + \frac{\hbar^2k_z^2}{2m_e}, \tag{1.24}$$

where $n$ is a non-negative integer, $\omega_c = q_eB_z/m_e$ is the cyclotron frequency, and $\hbar k_z$ is the
momentum of electrons in the $z$ direction before the magnetic field is applied [40]. This
result shows that the electron’s energy has two contributions. The kinetic energy associated
with motion parallel to the magnetic field is not changed by the field. However, the
motion perpendicular to the magnetic field is quantized in units of $\hbar\omega_c$. This situation
is similar to the discrete energy levels of a Q1D nanostructure considered earlier. If a strong
magnetic field is applied to a 2D structure (e.g., bilayer graphene [41]), it develops discrete
energy levels similar to a Q0D system. These energy levels are known as Landau levels.
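
The magnitude of the Landau-level spacing can be estimated with a short sketch. It assumes the level index runs over $n = 0, 1, 2, \ldots$ with a $(n + 1/2)$ prefactor, uses the free-electron mass, and takes a hypothetical field of 10 T.

```python
HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # free-electron mass (kg)
Q_E = 1.602176634e-19     # magnitude of the electron charge (C)
EV = 1.602176634e-19      # joules per electron volt


def cyclotron_frequency(b_z, mass=M_E):
    """omega_c = q_e B_z / m in rad/s."""
    return Q_E * b_z / mass


def landau_energy(n, b_z, k_z=0.0, mass=M_E):
    """(n + 1/2) hbar omega_c + hbar^2 k_z^2 / (2 m), in joules."""
    return ((n + 0.5) * HBAR * cyclotron_frequency(b_z, mass)
            + (HBAR * k_z) ** 2 / (2.0 * mass))


b_field = 10.0   # hypothetical field strength (T)
spacing_mev = (landau_energy(1, b_field) - landau_energy(0, b_field)) / EV * 1e3
print(f"Landau-level spacing at {b_field:.0f} T: {spacing_mev:.2f} meV")
```

At room temperature $k_BT$ is about 26 meV, so for free-electron masses the spacing is resolved only at low temperatures or very high fields.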
1.3.2 Density of States


In quantum-mechanical terms, a nanostructure is fully described if one knows all its energy
levels and the associated wave functions, obtained by solving the underlying Schrödinger
equation. To complicate the situation, the same energy value could be associated with
more than one wave function. Such situations are called degeneracies of the system. Even though it is hard
to find each individual wave function and its energy, applications often require knowledge
of the number of distinct energy levels and the number of wave functions associated with
each energy. The remedy is to use the concept of the density of states (DOS), $D_{\mathrm{os}}(E)$. This
quantity provides the number of available quantum states per differential energy interval
($dE$) per unit volume (area or length for 2D and 1D systems) in the vicinity of energy $E$.
The DOS can be viewed as a distribution function for the energy states in a nanostructure.
We can integrate it over a finite range to find the total number of states in a given energy
interval. For example, the number of states that fall within the energy range $[E_i, E_f]$ in a
unit volume can be found using the integral $\int_{E_i}^{E_f} D_{\mathrm{os}}(E)\,dE$.
We can use the DOS concept even for a nanostructure with discrete energy levels $E_n$
with $n$ ranging from 1 to $N$. If its volume is $V$, the DOS can be written as

$$D_{\mathrm{os}}(E) = \frac{1}{V}\sum_{n=1}^{N}\delta(E - E_n). \tag{1.25}$$

The degeneracy of an energy level can be included by multiplying the corresponding
delta function with its degeneracy value. The validity of this definition is easily seen by
integrating Eq. (1.25) over the energy range $[E_i, E_f]$:

$$\int_{E_i}^{E_f} D_{\mathrm{os}}(E)\,dE = \frac{1}{V}\sum_{n=1}^{N}\int_{E_i}^{E_f}\delta(E - E_n)\,dE. \tag{1.26}$$

As the integral reduces to 1 whenever an energy level is within the range $[E_i, E_f]$, the sum
over $n$ simply counts the number of energy levels in the interval $[E_i, E_f]$.
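
The counting argument can be written as a few lines of code. In this sketch each delta function contributes its degeneracy when its level lies inside the interval, and the total is divided by the volume. The list of levels, the volume, and the function name are hypothetical.

```python
def states_per_volume(levels, volume, e_i, e_f, degeneracies=None):
    """Integrate a sum of delta functions over [e_i, e_f] and divide by volume.

    Each level inside the interval contributes its degeneracy (1 by default).
    """
    if degeneracies is None:
        degeneracies = [1] * len(levels)
    inside = sum(g for e, g in zip(levels, degeneracies) if e_i <= e <= e_f)
    return inside / volume


# Hypothetical discrete levels (eV) and a unit volume in arbitrary units.
levels_ev = [0.38, 1.50, 3.38, 6.02]
count_plain = states_per_volume(levels_ev, 1.0, 1.0, 4.0)
count_spin = states_per_volume(levels_ev, 1.0, 1.0, 4.0, degeneracies=[2, 2, 2, 2])
print(count_plain, count_spin)
```

Passing a degeneracy of 2 for every level is one simple way to account for the two spin orientations of an electron.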

The preceding definition is not convenient in some specific situations. In those cases, it is
useful to employ an auxiliary function representing the number of states having energy less
than $E$ in a unit volume (or area in 2D systems). This function is known as the cumulative
DOS and is defined as $CD_{\mathrm{os}}(E) = \int_{-\infty}^{E} D_{\mathrm{os}}(E')\,dE'$. We can find $D_{\mathrm{os}}(E)$ by differentiating
$CD_{\mathrm{os}}(E)$ with respect to $E$.

The main deficiency of the definition of DOS in Eq. (1.25) is that it does not depend
on the location $\mathbf{r}$ within a nanosize particle. It is entirely possible for the wave function to
vanish or become so small at certain locations that it is unlikely that the particle will ever
be there. To make the DOS definition meaningful in such cases, the local DOS (LDOS) is
introduced as

$$LD_{\mathrm{os}}(E, \mathbf{r}) = \frac{1}{V}\sum_{n=1}^{N}|\Psi_n(\mathbf{r})|^2\,\delta(E - E_n), \tag{1.27}$$

where $\Psi_n(\mathbf{r})$ is the wave function for the energy eigenvalue $E_n$. An illuminating discussion
of LDOS in nanoplasmonic systems in the frequency range dominated by localized
surface plasmons can be found in Ref. [42].
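
A simple model makes the position dependence of the LDOS explicit. The sketch below uses the normalized wave functions of an infinite one-dimensional well, $\Psi_n(x) = \sqrt{2/L}\,\sin(n\pi x/L)$, as an assumed example (they are not given in the text) and evaluates the weights $|\Psi_n(x)|^2$ that multiply each delta function.

```python
import math


def box_probability_density(n, x, length):
    """|Psi_n(x)|^2 for an infinite 1D well: (2/L) sin^2(n pi x / L)."""
    return (2.0 / length) * math.sin(n * math.pi * x / length) ** 2


def ldos_weights(x, length, n_max):
    """Weights multiplying delta(E - E_n) in the local DOS at position x."""
    return [box_probability_density(n, x, length) for n in range(1, n_max + 1)]


well_length = 1.0                                    # arbitrary length unit
weights_center = ldos_weights(0.5 * well_length, well_length, 4)
print([round(w, 3) for w in weights_center])
```

At the center of the well every even-$n$ state has a node, so those levels do not contribute to the LDOS there even though they are counted in the ordinary DOS.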

We should mention that several different definitions for DOS are used in the literature, based
on the context of specific applications. Sometimes, normalization of the DOS in Eq. (1.25)
by volume is not carried out. This normalization does make sense for large structures such
as a bulk crystal. However, for small structures, one may simply count the total number
of states without normalization. In some situations, it makes sense to use a variable other
than energy as the integrating variable. For example, when the DOS of electromagnetic
modes in vacuum is calculated for blackbody radiation (Planck’s distribution), energy is
considered to have the form $E = n\hbar\omega$, where $n$ is an integer and $\omega$ is the frequency of the
radiation. In this case, it is common to use $\omega$ as the integrating variable and consider the
DOS a function of frequency such that $D_{\mathrm{os}}(E)\,dE = D_{\mathrm{os}}(\omega)\,d\omega$ [see Eq. (1.29) in Aside
1.4]. It is also important to consider other aspects of a quantum state. For example, an
electron can have its spin up or down, so Pauli’s exclusion principle allows two electrons
per spatial state. Similarly, a photon could be in two orthogonal polarization states. The DOS
associated with quantum states is often multiplied by an occupancy number to find the
relevant DOS for the particle of interest.


Aside 1.4 Density of States for Photons 

Consider a rectangular cavity with its sides $d_1$, $d_2$, $d_3$ aligned with the three Cartesian axes.
Maxwell’s equations show that such a cavity supports plane waves of the form $\exp(i\mathbf{k}\cdot\mathbf{r})$,
where the wave vector $\mathbf{k}$ indicates the direction of propagation of a specific plane wave.
Using the boundary condition that the electric field must vanish at all surfaces of this
cavity, the three components of $\mathbf{k}$ are found to be quantized such that they take discrete
values $k_j = n_j(2\pi/d_j)$, where $n_j$ is an integer (positive or negative). The frequency of an
allowed mode is $\omega = ck$, where $k$ is the magnitude of the $\mathbf{k}$ vector. Noting that each photon
has an energy $E = \hbar\omega$, we want to calculate the DOS as a function of $E$.


In the k-space, the volume associated with each state is $(2\pi)^3/V$, where $V = d_1d_2d_3$ is the
volume of the cavity. The number of quantum states having energy less than $E = \hbar ck$ per
unit volume of the cavity, representing the cumulative DOS, is found by calculating how
many states are contained in a sphere of radius $k$ in the k space:

$$CD_{\mathrm{os}}(E) = \frac{2}{V}\,\frac{4\pi k^3/3}{(2\pi)^3/V} = \frac{k^3}{3\pi^2} = \frac{E^3}{3\pi^2\hbar^3c^3}, \tag{1.28}$$

where the factor of 2 accounts for the possibility that each photon can have two orthogonal
polarization states. Differentiating the cumulative DOS with respect to $E$, we obtain the
DOS for photons:

$$D_{\mathrm{os}}(E) = \frac{E^2}{\pi^2\hbar^3c^3}, \qquad D_{\mathrm{os}}(\omega) = \frac{\omega^2}{\pi^2c^3}, \tag{1.29}$$

where we used the relation $D_{\mathrm{os}}(E)\,dE = D_{\mathrm{os}}(\omega)\,d\omega$ with $E = \hbar\omega$. This result is for a
3D cavity. The same argument can be used in fewer dimensions. Noting that the sphere
becomes a circle of area $\pi k^2$ in the 2D case and just a line of length $k$ in the 1D case, the
DOS for photons in these two cases is found to be

$$D_{\mathrm{os}}^{\mathrm{2D}}(E) = \frac{E}{\pi\hbar^2c^2}, \qquad D_{\mathrm{os}}^{\mathrm{1D}}(E) = \frac{1}{\pi\hbar c}. \tag{1.30}$$
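
The sphere-counting argument of this aside can be tested by brute force. The sketch below counts the allowed wave vectors $k_j = n_j(2\pi/d)$ of a cubic cavity that fall inside a sphere, doubles the count for polarization, and compares the result with the continuum expression $k^3/(3\pi^2)$ for the cumulative DOS. The cube side and the sphere radius (15 in units of $2\pi/d$) are arbitrary illustrative choices.

```python
import math


def counted_modes_per_volume(radius_n, side=1.0):
    """Count modes with |k| <= radius_n * (2 pi / side), two polarizations each."""
    n_max = int(math.floor(radius_n))
    r2 = radius_n ** 2
    points = sum(1
                 for n1 in range(-n_max, n_max + 1)
                 for n2 in range(-n_max, n_max + 1)
                 for n3 in range(-n_max, n_max + 1)
                 if n1 * n1 + n2 * n2 + n3 * n3 <= r2)
    return 2.0 * points / side ** 3


def cumulative_dos_continuum(k):
    """Continuum estimate k^3 / (3 pi^2) of modes per unit volume below k."""
    return k ** 3 / (3.0 * math.pi ** 2)


radius_n = 15.0                                    # sphere radius in units of 2*pi/side
counted = counted_modes_per_volume(radius_n)
estimated = cumulative_dos_continuum(2.0 * math.pi * radius_n)
relative_error = abs(counted - estimated) / estimated
print(f"counted = {counted:.0f}, continuum = {estimated:.0f}")
```

The small residual difference comes from lattice points near the surface of the sphere and shrinks as the radius grows, which is why the continuum formula is adequate for cavities much larger than the wavelength.
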
The DOS of a homogeneous structure containing electrons depends on its dispersion
relation E(k) showing how an electron’s energy depends its momentum ñik. Since this 
relation depends on the dimensionality of that structure, we consider the general dispersion 
relation E = Eg(k), where d denotes the dimensionality of the structure. This relation 
describes a surface of constant energy in the k space. The unit vector normal to this surface 
is related to Vy Eq as 


k1 = VkEa(k)/|VkEp(k)]. (1.31) 


Using it, the perpendicular and parallel components of the vector dk are found to be dk_⊥ = 
dk · k̂_⊥ and dk_∥ = dk − dk_⊥, respectively. The differential volume in the d-dimensional k 
space can be written as d^d k = d^{d−1}k_∥ dk_⊥. 

We saw in Aside 1.4 that each dimension with length d_j has a k-space projection of 
2π/d_j. The number of states in the k space per unit volume is given by d^d k/(2π)^d, assuming 
a d-dimensional hypersphere in the k space of the structure. Differential geometry tells 
us that d^{d−1}k_∥ = 2k^{d−1}π^{d/2}/Γ(d/2), where Γ(x) is the gamma function. Using this result, 
the DOS of the structure is given by 


\[ \mathrm{DOS}(E) = \frac{\pi^{d/2}\, k^{d-1}}{2^{d-1}\, \pi^{d}\, \Gamma(d/2)\, |\nabla_k E_d(\mathbf{k})|}, \tag{1.32} \]


where we have not yet included the degeneracy of the energy states. 

As a simple example, consider free electrons inside a conductor in d dimensions large 
enough that no quantum confinement occurs. The dispersion relation in this case is 
parabolic and is given by E = ħ²k²/2m_e. Considering that each state can be occupied 
by two electrons, we multiply the preceding result by 2 and use k = √(2m_e E)/ħ to obtain 

\[ D_d(E) = 2\left(\frac{m_e}{2\pi}\right)^{d/2} \frac{E^{(d-2)/2}}{\Gamma(d/2)\,\hbar^{d}}. \tag{1.33} \]

Table 1.2 lists the DOS of both free electrons and photons for d = 0, 1, 2, and 3. It is 
clear that the dimensionality of the structure affects the functional dependence of DOS on 
energy and plays a critical role. 


Table 1.2 Density of states D_d(E) in 0D, 1D, 2D, and 3D structures. 

            0D       1D                            2D                3D 
Electrons   2δ(E)    (2m_e/π²ħ²)^{1/2} E^{−1/2}    m_e/(πħ²)         (2m_e³)^{1/2} (π²ħ³)^{−1} E^{1/2} 
Photons     0        (πħc)^{−1}                    (πħ²c²)^{−1} E    (π²ħ³c³)^{−1} E² 

Figure 1.7 Parabolic form of the DOS in a bulk semiconductor and changes in its form when electrons are confined in more and 
more directions using QWs, QWRs, and QDs. 
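
As a quick consistency check, the general free-electron result D_d(E) = 2(m_e/2π)^{d/2} E^{(d−2)/2}/[Γ(d/2)ħ^d] can be evaluated for d = 1, 2, 3 and compared with the closed forms listed for electrons in Table 1.2. The sketch below is only illustrative: the function names are hypothetical, and natural units (m_e = ħ = 1) are assumed by default to keep the numbers readable.

```python
import math


def free_electron_dos(energy, d, mass=1.0, hbar=1.0):
    """General d-dimensional free-electron DOS, spin factor 2 included."""
    if d not in (1, 2, 3):
        raise ValueError("this sketch only covers d = 1, 2, 3")
    prefactor = 2.0 * (mass / (2.0 * math.pi)) ** (d / 2.0)
    return prefactor * energy ** ((d - 2) / 2.0) / (math.gamma(d / 2.0) * hbar ** d)


def tabulated_dos(energy, d, mass=1.0, hbar=1.0):
    """Closed forms for free electrons in 1D, 2D, and 3D."""
    if d == 1:
        return math.sqrt(2.0 * mass / (math.pi ** 2 * hbar ** 2)) / math.sqrt(energy)
    if d == 2:
        return mass / (math.pi * hbar ** 2)
    if d == 3:
        return math.sqrt(2.0 * mass ** 3) * math.sqrt(energy) / (math.pi ** 2 * hbar ** 3)
    raise ValueError("this sketch only covers d = 1, 2, 3")


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, free_electron_dos(0.7, d), tabulated_dos(0.7, d))
```

The two columns printed for each d coincide, which confirms that the tabulated expressions are special cases of the general formula with Γ(1/2) = √π, Γ(1) = 1, and Γ(3/2) = √π/2.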


Consider how the DOS changes when a bulk structure is confined in one or 
more dimensions. Table 1.2 shows that the DOS of a 3D structure varies with E as E^{1/2}. 
Figure 1.7 shows how this DOS is modified from its parabolic form in a bulk medium as 
electrons are confined in one, two, and three directions using first QWs, then QWRs, and 
finally QDs. 

It is interesting to ask what happens when a structure, fully localized in d dimensions, 
is replaced with its quasi-partner where localization is not total. For example, as seen in 
Table 1.2, the DOS of a strictly 1D free-electron gas varies with E as E^{−1/2}. In a quasi-1D 
structure such as a quantum wire, multiple branches display the same E^{−1/2} dependence 
starting at the discrete energy states of the QWR. The results given in Table 1.2 and 
Figure 1.7 are for idealized isotropic systems with relatively simple geometries. In practice, 
many complications arise owing to nonideal conditions such as a finite size of samples, lattice 
defects, and surface effects. To include them, one must resort to numerical schemes to 
predict the DOS of such structures. 


1.4 Quantum Interference 


The wave function of a quantum particle governs all of its properties. This wave function 
Ψ(r, t), obtained by solving Eq. (1.18), is a complex quantity. The “Born rule” states that 
|Ψ(r, t)|² represents the probability density of finding the quantum particle at the location r at 
time t. The phase of the wave function describes relative positions of the peaks and valleys 
of the wave associated with this particle [40]. Although no technique exists that can determine 
the wave function completely [43], the so-called weak measurements can be used to 
measure it approximately [44]. The fundamental idea behind such a method is to measure 
sequentially two complementary variables of the system, while minimizing the disturbance 
induced by the first measurement. Recall that a measurement is carried out by coupling a 
measuring apparatus to the quantum system. 

The weak-measurement technique is based on reducing the coupling between the measuring 
apparatus and the quantum system to minimize the disturbance of the quantum state 
[45]. A characteristic feature of a weak measurement is that it does not affect a subsequent 
measurement of the same or another observable in the limit of negligible coupling. If A is 
the observable of concern and C is the second observable whose measurement gives the 
value c, the “weak value” of A is given by ⟨A⟩_w = ⟨c|A|Ψ⟩/⟨c|Ψ⟩ [44]. Even though 
this strategy compromises precision to some extent, the precision can be improved by averaging. 
The surprising aspect is that, unlike the standard expectation value ⟨A⟩ = ⟨Ψ|A|Ψ⟩ that 
is always real, the weak value ⟨A⟩_w can be a complex number. This fact is exploited in 
the weak-measurement technique to find both the real and imaginary parts of the wave 
function. 

To illustrate the versatility of this method, Lundeen et al. [44] consider an example 
where a weak measurement of the position (A = π_x = |x⟩⟨x|) is followed by a strong (or 
normal) measurement of momentum, giving the value p. The weak value of the 
position is then obtained from 

\[ \langle \pi_x \rangle_w = \frac{\langle p|x\rangle \langle x|\Psi\rangle}{\langle p|\Psi\rangle}. \tag{1.34} \]

Noting that ⟨p|x⟩ = exp(ipx/ħ) and Ψ(x) = ⟨x|Ψ⟩, we see that ⟨π_x⟩_w is proportional to 
the wave function Ψ(x) of the particle when p = 0. In other words, at each position x, the 
observed position and momentum values are proportional to the real and imaginary parts 
of Ψ(x), and thus they can be used to construct the wave function of a quantum particle. 
Normalization of the wave function is needed to remove the proportionality constant. As 
normalization involves only the magnitude of the wave function, this technique does not 
measure the absolute phase of a wave function. 
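
The proportionality between the weak value and the wave function can be illustrated on a discretized grid. The sketch below is a toy calculation, not a model of the experiment in Ref. [44]: it assumes a uniform position grid, on which ⟨p = 0|x⟩ is the same constant at every grid point and therefore cancels between the numerator and denominator of the weak value. The function names and the Gaussian test state are hypothetical choices.

```python
import cmath
import math


def weak_values_zero_momentum(psi):
    """Weak value of |x><x| at each grid point, post-selected on p = 0.

    On a uniform grid <p=0|x> is a common constant, so the weak value
    reduces to psi(x) divided by the sum of psi over the grid.
    """
    total = sum(psi)
    if abs(total) == 0:
        raise ValueError("post-selection on p = 0 has zero amplitude")
    return [amp / total for amp in psi]


def normalize(vec):
    """Rescale a list of complex amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in vec))
    return [a / norm for a in vec]


def overlap_magnitude(u, v):
    """|<u|v>| for two lists of complex amplitudes."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v)))


if __name__ == "__main__":
    xs = [-5.0 + 0.1 * i for i in range(101)]
    psi = normalize([cmath.exp(-x * x / 2.0 + 0.5j * x) for x in xs])
    reconstructed = normalize(weak_values_zero_momentum(psi))
    # An overlap magnitude close to 1 means the states agree up to a global phase.
    print(overlap_magnitude(psi, reconstructed))
```

After normalization, the reconstructed amplitudes differ from the original ones only by a constant phase factor, in line with the remark that the absolute phase is not measured.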

We briefly discuss the physical meaning of a wave function’s phase. Consider two quantum 
states, |Ψ_a⟩ and |Ψ_b⟩, describing the same physical state of a system. This is possible 
if and only if they are linearly dependent (i.e., |Ψ_a⟩ = η|Ψ_b⟩, where η may be a complex 
number). If each state |Ψ_j⟩ (j = a, b) is normalized such that ⟨Ψ_j|Ψ_j⟩ = 1, the linear 
relationship between the two states requires that |η|² = 1. Thus, η is in the form η = e^{iφ}, 
where φ is a constant phase. As the two states represent the same physical state, one can 
argue that the phase of a quantum state has no physical meaning. 

However, it is possible to differentiate the two states by letting them interfere with each 
other to produce an interference pattern. Mathematically, 

\[ \langle \Psi_a + \Psi_b | \Psi_a + \Psi_b \rangle = 2 + \langle\Psi_a|\Psi_b\rangle + \langle\Psi_b|\Psi_a\rangle = 2[1 + \cos(\phi_r)], \tag{1.35} \]

where φ_r = φ_a − φ_b is the relative phase of two wave functions normalized such that 
⟨Ψ_a|Ψ_a⟩ = 1 and ⟨Ψ_b|Ψ_b⟩ = 1. This result shows clearly that it is possible to measure the 
relative phase of two quantum states. The relative phase φ_r does not change if we multiply 
both quantum states by the same phase factor exp(iφ), making it clear that the absolute 
phase of a wave function is not a relevant quantity. 
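
A few lines of code make the distinction between relative and absolute phase concrete. The sketch below assumes two states that differ from a common normalized state only by phase factors exp(iφ_a) and exp(iφ_b), for which the squared norm of the sum is 2[1 + cos(φ_a − φ_b)]. The two-component state `psi0` and the function name are arbitrary illustrative choices.

```python
import cmath
import math


def superposition_norm(phi_a, phi_b, psi0=(0.6, 0.8j)):
    """Squared norm of Psi_a + Psi_b for Psi_j = exp(i*phi_j) * psi0.

    psi0 is an arbitrary normalized state (0.36 + 0.64 = 1), used only
    for illustration.
    """
    factor = cmath.exp(1j * phi_a) + cmath.exp(1j * phi_b)
    return sum(abs(factor * c) ** 2 for c in psi0)


if __name__ == "__main__":
    phi_a, phi_b = 0.3, 1.1
    print(superposition_norm(phi_a, phi_b))
    print(2.0 * (1.0 + math.cos(phi_a - phi_b)))
    # Adding the same phase to both states leaves the result unchanged.
    print(superposition_norm(phi_a + 2.0, phi_b + 2.0))
```

All three printed values coincide: the interference term depends on φ_a − φ_b only, and a common phase shift has no observable effect.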

The relative phase of two quantum states can be measured through quantum interference, 
whose effects are analogous to those familiar from classical interference in optics. Indeed, 
Young’s double-slit experiment can be repeated for quantum particles (such as electrons 
or neutrons) to produce similar fringe patterns. However, there are important differences 
between the classical and quantum interferences. The most important one is that the amplitude 
of an electromagnetic field varies in space and time through a measurable quantity 
such as an electric field that has a measurable phase, in contrast to that of a quantum wave 
function. As only the relative phase between two quantum states can be observed, it is 
important to know how it can be controlled in practice for exploitation of quantum effects. 

The mapping Ψ(r, t) → exp(iφ)Ψ(r, t) is called a phase transformation in quantum 
mechanics. If φ is a constant everywhere both in space and time, the transformation 
is referred to as a global phase transformation. However, if φ varies with r and t, 
the transformation is referred to as a local phase transformation. Although only some 
expectation values of a physical observable can be invariant under local phase transformations, 
all expectation values for the same observable are invariant under global phase 
transformations. 

As our focus in this book is mainly on the interaction of charges with electromagnetic 
fields inside a nanoscale device, it is useful to analyze how the phase of a quantum state 
evolves in an electromagnetic environment. An electromagnetic wave is described by its 
electric field E and magnetic flux density B. The dynamics of these vectors and material 
charges are governed by the Maxwell equations and the Lorentz force equation. As is well 
known in electromagnetic theory, it is useful to introduce the scalar potential V(r, t) and 
the vector potential A(r, t). The electric and magnetic fields can be written in terms of them 
as [30] 

\[ \mathbf{E} = -\nabla V - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}. \tag{1.36} \]

However, the two potentials, V(r, t) and A(r, t), are not uniquely defined. The same values 
of the electric and magnetic fields are obtained if we modify them as 

\[ V'(\mathbf{r}, t) = V(\mathbf{r}, t) - \frac{\partial \xi}{\partial t}, \qquad \mathbf{A}'(\mathbf{r}, t) = \mathbf{A}(\mathbf{r}, t) + \nabla\xi, \tag{1.37} \]

where ξ = ξ(r, t) is an arbitrary scalar function. As E and B are the real physically measurable 
quantities, the transformation of the two potentials by an arbitrary ξ function should 
have no observable consequence. Transformations that leave the observable physical quantities 
unchanged are known as gauge transformations, and the function ξ(r, t) is referred to 
as a gauge function. 
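
The invariance of the fields under a gauge transformation can be verified numerically. The sketch below is an illustration under stated assumptions: it works in a 2D slice (fields depend on x, y, and t only), uses central differences for the derivatives in E = −∇V − ∂A/∂t and B = ∇ × A, and picks arbitrary potentials and an arbitrary gauge function ξ = t² sin(x) cos(y). None of these choices comes from the text.

```python
import math


def partial(f, point, axis, h=1e-5):
    """Central-difference derivative of f(x, y, t) along one argument."""
    up = list(point)
    down = list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def fields(V, Ax, Ay, point):
    """Return (Ex, Ey, Bz) from the scalar and vector potentials."""
    ex = -partial(V, point, 0) - partial(Ax, point, 2)
    ey = -partial(V, point, 1) - partial(Ay, point, 2)
    bz = partial(Ay, point, 0) - partial(Ax, point, 1)
    return ex, ey, bz


# Arbitrary illustrative potentials.
def V(x, y, t):
    return x * y * math.cos(t)


def Ax(x, y, t):
    return -y * math.sin(t)


def Ay(x, y, t):
    return x * x * t


# Analytic derivatives of the assumed gauge function xi = t^2 sin(x) cos(y).
def xi_t(x, y, t):
    return 2.0 * t * math.sin(x) * math.cos(y)


def xi_x(x, y, t):
    return t * t * math.cos(x) * math.cos(y)


def xi_y(x, y, t):
    return -t * t * math.sin(x) * math.sin(y)


# Transformed potentials: V' = V - d(xi)/dt, A' = A + grad(xi).
def V_new(x, y, t):
    return V(x, y, t) - xi_t(x, y, t)


def Ax_new(x, y, t):
    return Ax(x, y, t) + xi_x(x, y, t)


def Ay_new(x, y, t):
    return Ay(x, y, t) + xi_y(x, y, t)


if __name__ == "__main__":
    point = (0.4, -0.7, 1.3)
    print(fields(V, Ax, Ay, point))
    print(fields(V_new, Ax_new, Ay_new, point))
```

Both calls return the same field values to within the finite-difference error, because the mixed derivatives of ξ cancel in E and the curl of a gradient vanishes in B.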

Let us consider how the Hamiltonian of a freely moving electron changes in the presence 
of an electromagnetic field. Because of the Lorentz force acting on the electron, its momentum 
p is affected by the electromagnetic field. In the Coulomb gauge, the Hamiltonian H 
given in Eq. (1.19) changes such that [46] 

\[ H = \frac{1}{2m_e}\left(-i\hbar\nabla - q_e\mathbf{A}\right)^2 + q_e V. \tag{1.38} \]

If a gauge transformation is carried out on the fields interacting with the electron, we 
need to use new potentials as indicated in Eq. (1.37). The resulting Hamiltonian has a 
time-dependent term q_e(∂ξ/∂t) when V in Eq. (1.38) is replaced with V′. Its presence 
modifies the phase of the wave function as |Ψ′⟩ = |Ψ⟩ exp(iq_eξ/ħ). Clearly, the gauge 
transformation of an electromagnetic field results in a local-phase transformation on the 
wave function of the electron. 

Figure 1.8 Wave function of a charged particle, moving along the loop C in the presence of a magnetic field B, acquires a phase 
shift that is proportional to the flux passing through the area enclosed by that loop. 

The phase of a wave function can also depend on the vector potential. Suppose |Ψ_0⟩ 
is the quantum state in the absence of any electric or magnetic field. If a magnetic field 
turns on adiabatically (i.e., the Hamiltonian changes slowly with time), an electron will follow 
a circular trajectory C, as seen in Figure 1.8. It follows from Eq. (1.38) that the quantum 
state of this electron at any point r along this loop can be written as [46] 

\[ |\Psi_A\rangle = \exp\left(\frac{iq_e}{\hbar}\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{A}\cdot d\mathbf{r}\right)|\Psi_0\rangle, \tag{1.39} \]

where |Ψ_0⟩ is the state in the absence of the magnetic field and r_0 is an arbitrary reference 
point on the loop C. Clearly, the phase acquired by the electron on completing the loop C 
is given by φ_c = (q_e/ħ)∮A·dr. Using Stokes’ theorem, we can transform this closed-loop 
integral to the following surface integral: 

\[ \phi_c = \frac{q_e}{\hbar}\oint_C \mathbf{A}\cdot d\mathbf{r} = \frac{q_e}{\hbar}\iint_S \mathbf{B}\cdot d\mathbf{s}, \tag{1.40} \]


where we used the relation B = ∇ × A and the vector ds is normal to the surface area S 
enclosed by the loop C. This phase is gauge invariant because the surface integral depends 
on the applied magnetic field through B, which does not change regardless of the gauge. 
Note also that this integral does not change if the flux through the loop remains constant 
while the trajectory changes. Even in the extreme case where the magnetic field is shielded 
such that it is finite near the center but vanishes close to the electron’s trajectory, the electron 
will still acquire the same phase shift. This phenomenon is exploited in the Aharonov-Bohm 
effect described in Aside 1.5. 


Aside 1.5 Aharonov-Bohm Effect 


Aharonov and Bohm [47] discovered in 1959 that the behavior of a quantum system can 
be altered by a magnetic field in a nonlocal manner (i.e., behavior changes even when 
the magnetic field is zero everywhere in the vicinity of all charges). Figure 1.9 shows a 
setup where an electron beam is split into two, and the two beams are made to interfere 
after taking different paths (γ_1 and γ_2) around a shielded cylinder containing a magnetic 
field. Suppose the quantum state of an electron just before the beam’s splitting is |Ψ_0⟩. The 
electron must take either the path γ_1 or γ_2. Based on Eq. (1.39), we can write the quantum 
state of that electron after traversing one of these paths as 

\[ |\Psi_{\gamma_1}\rangle = \exp\left(\frac{iq_e}{\hbar}\int_{\gamma_1} \mathbf{A}\cdot d\mathbf{r}\right)|\Psi_0\rangle, \qquad |\Psi_{\gamma_2}\rangle = \exp\left(\frac{iq_e}{\hbar}\int_{\gamma_2} \mathbf{A}\cdot d\mathbf{r}\right)|\Psi_0\rangle. \tag{1.41} \]

Figure 1.9 Illustration of the Aharonov-Bohm effect. An interference pattern is observed when an electron beam takes two 
different paths around a shielded magnetic field. 

Since it is not possible to know which path is taken by an electron, its state |Ψ⟩ after the 
two beams are combined is a superposition of the two states (i.e., |Ψ⟩ = |Ψ_{γ_1}⟩ + |Ψ_{γ_2}⟩). 
As a result, the probability of finding the electron at a location r is given by 

\[ |\langle \mathbf{r}|\Psi\rangle|^2 = 2\,|\langle \mathbf{r}|\Psi_0\rangle|^2\,[1 + \cos(\phi_r)], \tag{1.42} \]

where φ_r is the relative phase difference. Using Stokes’ theorem, φ_r is found to be 

\[ \phi_r = \frac{q_e}{\hbar}\oint \mathbf{A}\cdot d\mathbf{r} = \frac{q_e}{\hbar}\iint_S \mathbf{B}\cdot d\mathbf{s}, \tag{1.43} \]

where the first integral is over the entire loop (after the direction of the γ_2 path is reversed) 
and S is the area enclosed by this loop. 


It is important to realize that, even when the magnetic field is switched off (B = 0), an 
interference pattern forms commensurate with the de Broglie wavelength of electrons. 
The effect of the Aharonov-Bohm phase induced by an external magnetic field is to shift 
that interference pattern by an amount that depends on the magnetic field through φ_r. The 
remarkable feature is that even a shielded magnetic field far from the electron beam affects 
the quantum state of electrons, and thus can be detected through an interferometric measurement. 
One may ask how the electron knows that such a magnetic field exists. The 
answer is that the vector potential has a nonzero value near the electron’s path, even though 
B itself vanishes owing to magnetic shielding. This has led to philosophical discussions of 
whether the vector potential is more fundamental than the field that creates it. 
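
The claim that the phase depends only on the enclosed flux, and not on the shape of the path, can be checked with a short calculation. The sketch below is an idealized illustration: it assumes an infinitely thin solenoid on the z axis, outside of which B = 0 and A = Φ/(2πρ) in the azimuthal direction, and it takes q_e as the magnitude of the electron charge. Function names and loop shapes are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34           # J s (exact SI value)
Q_E = 1.602176634e-19               # C, magnitude of the electron charge (assumed sign)
HBAR = H_PLANCK / (2.0 * math.pi)


def vector_potential(x, y, flux):
    """A = (Ax, Ay) outside an ideal thin solenoid carrying total flux `flux`."""
    rho2 = x * x + y * y
    scale = flux / (2.0 * math.pi * rho2)
    return -scale * y, scale * x


def loop_integral(vertices, flux, steps=2000):
    """Midpoint-rule line integral of A . dr around a closed polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for s in range(steps):
            ax, ay = vector_potential(x0 + (s + 0.5) * dx, y0 + (s + 0.5) * dy, flux)
            total += ax * dx + ay * dy
    return total


def ab_phase(flux):
    """Relative phase (q_e / hbar) * flux acquired around the enclosed flux."""
    return Q_E * flux / HBAR


def fringe_factor(flux):
    """Interference factor 1 + cos(phi_r) at a point of zero path difference."""
    return 1.0 + math.cos(ab_phase(flux))


if __name__ == "__main__":
    centered = [(1, -1), (1, 1), (-1, 1), (-1, -1)]      # counterclockwise square
    shifted = [(3, -2), (3, 5), (-1, 5), (-1, -2)]       # off-center rectangle
    print(loop_integral(centered, 1.0), loop_integral(shifted, 1.0))
    print(fringe_factor(0.0), fringe_factor(H_PLANCK / (2.0 * Q_E)))
```

Both loops return the enclosed flux even though B vanishes everywhere along them, and a flux of h/(2q_e) turns a bright fringe into a dark one.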


It has been shown that Eqs. (1.39) and (1.40) can be generalized to nonadiabatic cyclic 
systems [48, 49]. In Figure 1.8, the quantum state |Ψ(t)⟩ is cyclic in time with a period 
τ because the electron returns to its original location after each round trip over the loop. 
Because this quantum state accumulates phase as it propagates along the loop, the cyclic 
feature implies that there is a net phase shift of Δφ after each round trip such that |Ψ(τ)⟩ = 
exp(iΔφ)|Ψ(0)⟩. 
We can introduce a transformation, |Ψ(t)⟩ = exp[if(t)]|Ψ̃(t)⟩, such that the new state 
does not change after one round trip (i.e., |Ψ̃(τ)⟩ = |Ψ̃(0)⟩). This is possible if the function 
f(t) satisfies f(τ) − f(0) = Δφ (modulo 2π). With the help of the Schrödinger equation, 
one can show that the phase shift Δφ consists of two parts [50]: 

\[ \Delta\phi = \underbrace{-\frac{1}{\hbar}\int_0^{\tau} \langle\Psi(t)|H(t)|\Psi(t)\rangle\,dt}_{\Delta\phi_d} + \underbrace{i\int_0^{\tau} \langle\tilde{\Psi}(t)|\frac{d}{dt}|\tilde{\Psi}(t)\rangle\,dt}_{\Delta\phi_g}, \tag{1.44} \]

where Δφ_d is called the dynamic phase and Δφ_g is called the geometric phase. The 
latter is responsible for the Aharonov-Bohm effect. As Δφ_g is independent of the specific 
Hamiltonian H(t) that produces motion along the curve C, it is only related to the 
geometry of the motion and does not depend on the nature of interactions in the quantum 
system. It is also known in the literature as the Pancharatnam-Berry phase (or just as the Berry 
phase). 

One may wonder whether the scalar potential can also give rise to a geometric phase. 
This is indeed the case, and the associated effect is called the Aharonov-Casher effect 
[51]. It was predicted in 1984 and involves the interaction of a static electric field with a 
moving magnetic dipole. It is commonly identified as a dual effect because moving electrical 
charges in the Aharonov-Bohm effect are replaced by moving magnetic dipoles, and 
the vector potential induced by a static magnetic field is replaced by the scalar potential 
induced by a static electric field. The reason that a magnetic dipole is needed is related to the 
fundamental absence of magnetic monopoles. 

The original Aharonov-Bohm effect considered electrons propagating in a vacuum. It is 
interesting to ask what would happen if the vacuum path is replaced by a metallic path. As 
electrons in metals diffuse through a lattice consisting of positive ions, it is possible that 
phase coherence may be lost during collisions of an electron with the lattice. Consider first 
the scenario where electrons undergo inelastic scattering such that each collision destroys 
phase information. Owing to the change in energy of electrons when subjected to inelastic 
scattering, the relative phase in Eq. (1.42) acquires a time-dependent term such that 

\[ \phi_r = \frac{q_e}{\hbar}\oint \mathbf{A}\cdot d\mathbf{r} + (\phi_{\gamma_1} - \phi_{\gamma_2}) + \frac{E_{\gamma_1} - E_{\gamma_2}}{\hbar}\,t, \tag{1.45} \]

where φ_{γ_1} and φ_{γ_2} are the phase shifts along the two paths, and E_{γ_1} and E_{γ_2} are the new 
energies of the electrons after inelastic collisions along the paths γ_1 and γ_2, respectively. As 
this phase oscillates with time at a frequency (E_{γ_1} − E_{γ_2})/ħ, the interference term cos(φ_r) 
vanishes when averaged over time. Thus, the quantum-interference effects cannot be observed 
in the case of inelastic scattering. 
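
The washing out of the fringes by a time-dependent relative phase can be seen in a simple numerical average. The sketch below assumes a fringe factor of the form 1 + cos(φ_0 + Ωt), where Ω stands for the energy difference divided by ħ, and averages it over an observation window with the midpoint rule. The function name and all numerical values are illustrative assumptions.

```python
import math


def time_averaged_fringe(phase0, omega, duration, samples=100000):
    """Average of 1 + cos(phase0 + omega * t) over the window [0, duration]."""
    dt = duration / samples
    acc = 0.0
    for i in range(samples):
        t = (i + 0.5) * dt
        acc += 1.0 + math.cos(phase0 + omega * t)
    return acc / samples


if __name__ == "__main__":
    # Elastic case (omega = 0): the fringe factor keeps its dependence on phase0.
    print(time_averaged_fringe(0.4, 0.0, 100.0))
    # Inelastic case (many oscillations within the window): the average tends to 1.
    print(time_averaged_fringe(0.4, 50.0, 100.0))
```

With Ω = 0 the result retains the value 1 + cos(φ_0), whereas for ΩT ≫ 1 the cosine term averages out and no fringe contrast remains.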

The situation changes if elastic scattering dominates because it not only preserves phase 
coherence but also does not change the energy of scattered electrons. The resulting relative 
phase is similar to that in Eq. (1.45), but without the last time-dependent term. The only 
constraint is that the metal path must be shorter than 10 μm, a typical elastic scattering 
length for metals. Clearly, the interference effects can be observed in this situation. Indeed, 
the effect is observable as conductance fluctuations in metals: constructive interference 
gives high conductance and destructive interference gives low conductance. An experimental 
demonstration of this effect was carried out in 1985 using a gold ring (diameter 
8 μm, thickness 0.4 μm) with a strong shielded magnetic field at its center [52]. A striking 
feature of this experiment was that sometimes electrons circulated two or more times in 
the loop before interference happened. This was deduced from the frequency of oscillations 
that depends on the total path length before interference occurs. This effect is known 
as the Altshuler-Aronov-Spivak effect in the literature and is an interesting example of the weak 
localization theory of interfering electrons [53]. Another interesting example of quantum 
interference is related to the persistent currents in small isolated metal rings; it is discussed 
in Aside 1.6. 

We have seen that phase changes in the quantum state of a nanosize object can lead 
to quantum-interference effects, which act as a fingerprint of the quantum character of 
the underlying dynamics. Such effects have no classical counterparts. In addition to the 
phenomena discussed in this section, there are other aspects of quantum interference that 
manifest often in a disguised form. Examples include quantized conductance, ballistic 
transport, quantum Hall effect, universal fluctuations of conductance, Anderson localiza- 
tion in disordered wires, and resonant tunneling. We consider some of these in the next 
section on quantum transport. 


Aside 1.6 Persistent Currents in Small Isolated Metal Rings 


Owing to inelastic scattering of an electron with phonons and other electrons, currents 
in metals eventually cease to exist after all fields are turned off. However, in metal rings 
that are smaller than the electron’s dephasing length (the typical distance an electron 
travels before it loses its phase information through inelastic scattering [54]), it is 
possible to induce a perpetual current flow simply by threading the center of the ring with 
a magnetic flux [55]. The manifestation of this effect is not only a signature of phase 
coherence of electrons in metals but also an example of the impact of the vector potential seen 
in the Aharonov-Bohm effect. 


Consider a metal ring of radius r as shown in Figure 1.10, subjected to a homogeneous 
magnetic field oriented perpendicular to the plane of the ring. The phase shift induced by 
the magnetic field can be calculated with the help of Eq. (1.40). Using the polar coordinates 
(ρ, θ) with the origin at the center of the ring and noting that A points in the θ direction 
as indicated in Figure 1.10, we can carry out the integration and obtain φ_c = (q_e/ħ)Φ_B, 
where Φ_B = 2πrA is the total flux through the ring. 

Figure 1.10 Illustration of a persistent current in a small metallic ring subjected to a magnetic field. 


We seek quantum states that are stationary on the ring, that is, we look for solutions of the 
Schrödinger equation (1.18) in the form Ψ = ψ exp(−iEt/ħ), where E is the energy of the 
state. This form leads to the time-independent Schrödinger equation, Hψ = Eψ. We solve 
it with the Hamiltonian in Eq. (1.38) with V = 0 and A = Φ_B/(2πr). Writing this equation 
in polar coordinates and noting that ρ = r is a constant for the rotating electron, we need 
to solve 

\[ \frac{1}{2m_e}\left(-\frac{i\hbar}{r}\frac{d}{d\theta} - \frac{q_e\Phi_B}{2\pi r}\right)^2 \psi(\theta) = E\,\psi(\theta). \tag{1.46} \]

We look for a plane-wave solution in the form ψ(θ) = Ce^{ikθ}, where C is a constant. When 
we impose the condition ψ(θ + 2π) = ψ(θ) for the wave function to be periodic on the 
ring, the constant k is quantized and assumes discrete values k_n = n, where n can be any 
integer (n = 0, ±1, ...). For each value of n, we obtain a quantum state ψ_n(θ) with the 
energy eigenvalue E_n such that 

\[ \psi_n(\theta) = \frac{1}{\sqrt{2\pi}}\exp(in\theta), \qquad E_n = \frac{\hbar^2}{2m_e r^2}\left(n - \frac{q_e\Phi_B}{2\pi\hbar}\right)^2, \tag{1.47} \]

where the constant C was found through normalization. 


To get an estimate of the persistent current, we assume that the temperature of the system 
is close to absolute zero (which was indeed the case in Ref. [55]). Recalling that the energy 
levels are filled up to the Fermi level in metals, the total energy of the system is approximately 
E = 2Σ_n E_n, where the factor of 2 accounts for the electron’s spin and the sum 
is restricted such that E_n < E_F. Noting that E changes if Φ_B changes with time and using 
Faraday’s law, the current I is related to E as 

\[ I = -\frac{\partial E}{\partial \Phi_B} = \frac{q_e\hbar}{\pi m_e r^2} \sum_n \left(n - \frac{q_e\Phi_B}{2\pi\hbar}\right). \tag{1.48} \]

Clearly, the current depends on the number of electrons in the ring and its radius but it 
does not depend on the cross-sectional area of the ring [56, 57]. Even though I is affected 
by defects and other impurities in the metal ring, it may not entirely vanish if such disorder 
is moderate. Hence, persistent current may exist in rings with a finite resistance. An 
instrument such as an ammeter cannot be used to measure this current, as it requires phase 
coherence of electrons around the entire ring. In experiments, the presence of such currents 
was deduced indirectly by measuring a tiny magnetic moment induced by the circulating 
current. 
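
The flux dependence of the persistent current can be explored with a small numerical model. The sketch below is a toy calculation under stated assumptions: ring levels E_n = (ħ²/2m_e r²)(n − Φ_B/Φ_0)² with Φ_0 = h/q_e, a fixed number `n_levels` of doubly occupied lowest levels at zero temperature, and q_e taken as the magnitude of the electron charge. The function names, the cutoff `n_cut`, and the ring radius are illustrative choices.

```python
import math

H_PLANCK = 6.62607015e-34           # J s
HBAR = H_PLANCK / (2.0 * math.pi)
Q_E = 1.602176634e-19               # C, magnitude of the electron charge (assumed sign)
M_E = 9.1093837015e-31              # kg
FLUX_QUANTUM = H_PLANCK / Q_E       # Wb


def occupied_levels(flux, n_levels, n_cut=200):
    """Indices n of the n_levels lowest-energy ring states at the given flux."""
    f = flux / FLUX_QUANTUM
    return sorted(range(-n_cut, n_cut + 1), key=lambda n: (n - f) ** 2)[:n_levels]


def total_energy(flux, radius, n_levels):
    """Total energy 2 * sum(E_n) over the occupied levels (factor 2 for spin)."""
    f = flux / FLUX_QUANTUM
    scale = HBAR ** 2 / (2.0 * M_E * radius ** 2)
    return 2.0 * sum(scale * (n - f) ** 2 for n in occupied_levels(flux, n_levels))


def persistent_current(flux, radius, n_levels):
    """Analytic I = -dE/dPhi for the occupied levels."""
    f = flux / FLUX_QUANTUM
    prefactor = Q_E * HBAR / (math.pi * M_E * radius ** 2)
    return prefactor * sum(n - f for n in occupied_levels(flux, n_levels))


if __name__ == "__main__":
    radius, n_levels = 1e-6, 5
    flux = 0.2 * FLUX_QUANTUM
    step = 1e-4 * FLUX_QUANTUM
    numeric = -(total_energy(flux + step, radius, n_levels)
                - total_energy(flux - step, radius, n_levels)) / (2.0 * step)
    print(numeric, persistent_current(flux, radius, n_levels))
    print(persistent_current(flux + FLUX_QUANTUM, radius, n_levels))
```

The finite-difference derivative of the total energy matches the analytic sum, and shifting the flux by one flux quantum h/q_e leaves the current unchanged, since the set of occupied levels simply shifts by one.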


1.5 Quantum Transport 

Many quantum devices depend on charge transport for their operation. Ohm’s law was the 
cornerstone of charge transport in the twentieth century. It states that the current through a 
conducting material is linearly proportional to the voltage across it (I = GV); the constant 
of proportionality is called the conductance. 

Historically, charge transport in conductors was systematically modeled by Drude and 
Lorentz. The Drude theory predicts that the conductivity σ of a metal is given by σ = 
N_e q_e² τ_e/m_e, where the mean collision time τ_e of electrons is related to their mean free 
path l_e as τ_e = l_e/v_F (see Section 1.2). The underlying assumption in this theory is that the 
scattering processes involved are incoherent, resulting in charge transport that is essentially 
diffusive in nature. It is also assumed that different scattering processes (scattering from 
electrons, phonons, impurities, vacancies, and dislocations) are independent. It follows 
from Matthiessen’s rule that their scattering rates can be simply added [58]. 
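
The Drude expression and Matthiessen’s rule translate directly into a few lines of code. The sketch below is a minimal illustration: the function names are hypothetical, q_e is taken as the magnitude of the electron charge, and the copper-like numbers in the usage example (density 8.5 × 10²⁸ m⁻³, collision time 2.5 × 10⁻¹⁴ s) are rough textbook-style values assumed here for illustration, not values quoted in this chapter.

```python
Q_E = 1.602176634e-19      # C, magnitude of the electron charge
M_E = 9.1093837015e-31     # kg


def combined_collision_time(times):
    """Matthiessen's rule: rates 1/tau_i of independent processes add."""
    return 1.0 / sum(1.0 / t for t in times)


def collision_time_from_mean_free_path(mean_free_path, fermi_velocity):
    """tau_e = l_e / v_F."""
    return mean_free_path / fermi_velocity


def drude_conductivity(density, collision_time):
    """sigma = N_e * q_e^2 * tau_e / m_e (SI units, S/m)."""
    return density * Q_E ** 2 * collision_time / M_E


if __name__ == "__main__":
    # Two independent processes with equal collision times halve the total time.
    tau = combined_collision_time([5.0e-14, 5.0e-14])
    print(tau, drude_conductivity(8.5e28, tau))
```

Adding a further scattering channel always shortens the combined collision time and therefore lowers the conductivity, which is the content of Matthiessen’s rule.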

When the temperature of the metal transporting current is reduced to the extent that 
the quantum nature of electrons dominates over their thermal motion, Matthiessen’s rule 
breaks down. It has also been found that Ohm’s law itself fails when the drift velocity 
of electrons cannot increase indefinitely with increasing electric field and saturates to a 
constant value. It was eventually realized that the assumption that electrons behave like 
classical particles inside a metal leads to wrong quantitative predictions, especially in 
nanostructures. 


1.5.1 Charge Transport in Nanostructures 


According to a semiclassical theory, there are no fundamental restrictions on the conductivity 
of any material, and it can have all continuous values in a range that depends on the 
intrinsic properties of a specific material. However, following the discovery of the integer 
quantum Hall effect, it became clear that conductance is not a continuous variable but 
changes in steps of a basic quantum unit, the so-called conductance quantum given by 

\[ G_0 = \frac{q_e^2}{h}. \tag{1.49} \]

Existence of the conductance quantum was first observed in a 2D-electron gas formed 
between GaAs and AlGaAs semiconducting layers [59, 60]. This discovery opened up a 
new area of research on quantum localization. 

The subsequent discovery of the fractional quantum Hall effect propelled research on 
charge transport in new directions. A very useful relationship, due to Landauer (generalized 
later by Büttiker [61, 62]), states that conductance of a quantum material can be calculated 
by multiplying the conductance quantum G_0 with the quantum-mechanical transmission 
coefficient of that material. Details of this relation are given in Aside 1.7 (see also Section 
5.2). The important question that seems like a paradox is how dissipation occurs as a result 
of current flowing through a quantum conductor. Whenever a current I flows through a 
material with conductance G, energy is also dissipated at a rate I²/G. As elastic scattering 
is the only process responsible for the appearance of the conductance, there is no dynamical 
mechanism that can cause energy dissipation. The answer comes from the appearance of 
a contact resistance, primarily due to mismatch of available conduction channels between 
the quantum conductor and its reservoir (see Aside 5.2). The reasoning is based on the 
fluctuation-dissipation theorem, which predicts the behavior of systems obeying detailed 
balance; electrical resistances in quantum systems are covered by this theorem (see also 
Section 3.3). 


Aside 1.7 Landauer-Büttiker Formula 

Consider an ideal quantum conductor connected to two reservoirs at different chemical 
potentials, as shown in Figure 1.11. We assume that this conductor is connected to 
reservoirs through quasi-1D leads. When an electron enters these leads, it immediately 
loses any phase information. Both contacts are assumed to be in thermodynamic equilibrium 
so that the occupation probabilities for electrons are given by the Fermi-Dirac 
distributions, 

\[ f_{FD}(E - \mu_j) = \left[1 + \exp\left(\frac{E - \mu_j}{k_B T}\right)\right]^{-1} \quad (j = 1, 2), \tag{1.50} \]

where μ_1 and μ_2 are the chemical potentials of reservoirs 1 and 2, respectively, and T is the 
temperature. If V is the voltage applied between the two contacts, the chemical potentials 
relate to each other through μ_1 − μ_2 = q_eV. 


Figure 1.11 Schematic of a quantum conductor connected to two reservoirs (reservoir 1 and reservoir 2) at different 
chemical potentials via two leads; current flows because of the voltage difference V between the two leads. 

We can relate the current to the transmission probability T_21(E) of an electron of 
energy E as 

\[ I = \frac{2q_e}{h}\int_0^{\infty} T_{21}(E)\,[f_{FD}(E - \mu_1) - f_{FD}(E - \mu_2)]\,dE. \tag{1.51} \]

This equation is known as the Tsu-Esaki equation and is widely used for quantum devices. 
For sufficiently small voltages, we can expand f_FD(E − μ_j) in a Taylor series around the 
energy E_F, retaining only the first two terms in the expansion. The term in the square 
brackets inside the integral is then reduced to −q_eV(∂f_FD/∂E), where the derivative is 
evaluated at the energy E − E_F. In the limit T → 0, −∂f_FD/∂E becomes the delta 
function δ(E − E_F), and the conductance, G = I/V, of the quantum conductor becomes 

\[ G = 2G_0 T_{21}(E_F), \tag{1.52} \]

where the factor 2 accounts for the electron spin. This relationship is sometimes called the 
Landauer formula in the literature [63, 64]. It can be extended to a quantum conductor with 
multiple terminals, resulting in the so-called Landauer-Büttiker formula [61, 62]. 
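
The passage from the current integral to the Landauer conductance can be reproduced numerically. The sketch below is an illustration under stated assumptions: it takes the current as I = (2q_e/h)∫T(E)[f_FD(E − μ_1) − f_FD(E − μ_2)]dE with G_0 = q_e²/h, uses an energy-independent transmission, and restricts the integration to a window of ±40 k_BT around the two chemical potentials (valid when the chemical potentials are far above zero energy). Function names and parameter values are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34      # J s
Q_E = 1.602176634e-19          # C, magnitude of the electron charge
K_B = 1.380649e-23             # J/K
G0 = Q_E ** 2 / H_PLANCK       # conductance quantum per spin channel, in siemens


def fermi(energy, mu, kT):
    """Fermi-Dirac occupation, guarded against overflow far above mu."""
    x = (energy - mu) / kT
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def landauer_current(transmission, mu1, mu2, temperature, samples=20000):
    """Midpoint-rule estimate of (2 q_e / h) * int T(E) [f1(E) - f2(E)] dE."""
    kT = K_B * temperature
    lo = min(mu1, mu2) - 40.0 * kT     # occupations are equal outside this window
    hi = max(mu1, mu2) + 40.0 * kT
    dE = (hi - lo) / samples
    acc = 0.0
    for i in range(samples):
        e = lo + (i + 0.5) * dE
        acc += transmission(e) * (fermi(e, mu1, kT) - fermi(e, mu2, kT))
    return 2.0 * Q_E / H_PLANCK * acc * dE


if __name__ == "__main__":
    bias = 1e-3                        # volts
    mu2 = 1.0 * Q_E                    # 1 eV expressed in joules
    mu1 = mu2 + Q_E * bias
    current = landauer_current(lambda e: 0.5, mu1, mu2, temperature=4.0)
    print(current / bias, 2.0 * G0 * 0.5)
```

For a constant transmission the ratio I/V reproduces 2G_0T_21 at any temperature, because the integral of the difference of the two Fermi functions equals μ_1 − μ_2. An energy-dependent transmission can be supplied through the `transmission` callable.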


In quantum devices, charge transport cannot be studied by counting the flow of electrons 
because the classical picture of an electron being a point charge loses its validity. 
For this reason, electric current through a quantum device is quantified by calculating total 
charge transferred through the conductor per unit time. The paradox of electrical current is 
that even though charges are quantized, transferred charge can have practically any value, 
even a fraction of the charge of a single electron (i.e., the electrical current unlike charges 
is not quantized). This can be understood by noting that electric current through a conductor 
is merely a displacement of the electron cloud against the lattice of positive cores. 
As this displacement can be by any amount, the associated electrical current is a continuously 
varying quantity. To calculate the current, we use the concept of current density 
in quantum mechanics. If the wave function of the electron is Ψ, the current density is 
given by 

\[ \mathbf{J} = \frac{q_e\hbar}{2i m_e}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) = \frac{q_e\hbar}{m_e}\,\mathrm{Im}\left[\Psi^*\nabla\Psi\right], \tag{1.53} \]

where Im[...] stands for the imaginary part. 
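
A plane wave provides a simple test of the current-density expression J = (q_eħ/m_e) Im[Ψ*∇Ψ]. The sketch below is a one-dimensional illustration: the derivative is approximated by a central difference, q_e is taken as the magnitude of the electron charge, and the helper names are hypothetical. For Ψ = A exp(ikx), the result should equal q_eħk|A|²/m_e, that is, the charge density times the velocity ħk/m_e.

```python
import cmath

HBAR = 1.054571817e-34     # J s
Q_E = 1.602176634e-19      # C, magnitude of the electron charge
M_E = 9.1093837015e-31     # kg


def current_density_1d(psi, x, h=1e-12):
    """(q_e * hbar / m_e) * Im[conj(psi) * dpsi/dx] using a central difference."""
    dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)
    return Q_E * HBAR / M_E * (psi(x).conjugate() * dpsi).imag


def plane_wave(k, amplitude=1.0):
    """Return the function x -> amplitude * exp(i k x)."""
    def psi(x):
        return amplitude * cmath.exp(1j * k * x)
    return psi


if __name__ == "__main__":
    k = 1.0e9                                   # wave number in 1/m
    psi = plane_wave(k, amplitude=2.0)
    print(current_density_1d(psi, 0.3e-9))
    print(Q_E * HBAR * k * 2.0 ** 2 / M_E)      # analytic value for comparison
```

A purely real wave function, such as a standing wave, gives zero current in the same routine because Ψ*∇Ψ then has no imaginary part.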

As an electron moves inside a material, it is scattered from its path through collisions 
(elastic or inelastic) with other electrons, photons, and ions. Three different scenarios 
shown schematically in Figure 1.12 are possible for a conductor of length L and width 
W. If all collisions are elastic such that the phase of Ψ is preserved during each collision, 
charge transport maintains its coherent nature and is classified as being ballistic transport. 
Ballistic transport happens when both L and W are much smaller than the electron’s mean 
free path l_e. It is important to realize that the mean free path is unrelated to the phase 
coherence of electrons, which is determined by the phase coherence length l_φ, defined as 
the distance an electron travels before its phase is randomized. This quantity is closely 
related to dephasing time τ_φ, defined as the time after which it is not possible to trace back 
the phase information of an electron and determine electrons’ prior locations [54]. Information 
about several such time and length scales can be found in Ref. [65]. Note that the 
conductance in the case of ballistic transport is not infinite but must be an integer multiple 
of the conductance quantum given in Eq. (1.49). 

Figure 1.12 Illustration of three charge-transport regimes in a conductor of length L and width W: 
(a) ballistic transport, (b) quasi-ballistic transport, and (c) diffusive transport. 

Table 1.3 

Quantum Systems                  Classical Systems 
ballistic when L < l_e           diffusive when L > l_e 
phase coherent when L < l_φ      incoherent when L >> l_φ 

In the other extreme where inelastic scattering dominates and phase coherence is not 
preserved after each collision, charge transport is classified as diffusive transport. 
As seen in Figure 1.12(c), diffusive transport happens when channel dimensions are larger 
than $l_e$ but smaller than the localization length $\xi$. The quasi-ballistic transport regime lies 
between these two regimes. It occurs when scattering from the surface of a nanostructure 
is as important as internal scattering. The three regimes in Figure 1.12 can be 
classified by comparing the conductance $G$ of a device with the conductance quantum $G_0$: 
$G \ll G_0$ in regime (a), $G \sim G_0$ in regime (b), and $G \gg G_0$ in regime (c). In the first 
regime ($G \ll G_0$), transport takes place in rare discrete events as a result of single-electron 
tunneling. 

The device length L plays the critical role in determining whether the quantum nature 
of charge transport must be considered for any device. Table 1.3 lists the conditions that 
must be satisfied in the classical and quantum regimes. The mean free path $l_e$ of electrons 
is the length that divides these two regimes. The second length scale is the phase coherence 
length $l_\phi$. Charge transport remains phase coherent when $L < l_\phi$. On the other hand, if the 
device length $L$ exceeds both $l_e$ and $l_\phi$, charge transport becomes incoherent and can be 
treated classically. 
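
The length-scale comparisons described above can be turned into a small helper. The following Python sketch is illustrative only: the function name, the `margin` factor used to interpret "much larger than," the "partially coherent" label for the crossover, and the sample lengths are assumptions, not values from the text.

```python
def classify_transport(L, l_e, l_phi, margin=10.0):
    """Classify charge transport for a device of length L.

    L, l_e (mean free path), and l_phi (phase coherence length) must share
    the same length unit. `margin` is a hypothetical factor used to decide
    when L is "much larger" than l_phi; it is not taken from the text.
    """
    momentum = "ballistic" if L < l_e else "diffusive"
    if L < l_phi:
        phase = "phase coherent"
    elif L > margin * l_phi:
        phase = "incoherent"
    else:
        # Crossover region between the two limiting cases.
        phase = "partially coherent"
    return momentum, phase


if __name__ == "__main__":
    # Illustrative lengths in nanometers; not taken from the text.
    for length in (20.0, 200.0, 5000.0):
        print(length, classify_transport(length, l_e=50.0, l_phi=300.0))
```

The middle case shows that a device can be diffusive and still phase coherent, which is why the two length scales must be compared separately.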


1.5.2 Tunneling in Nanostructures 


Tunneling is an excellent example of the wave nature of quantum particles. Although not 
possible classically, a quantum particle confined by a potential-energy barrier (because it 
lacks the kinetic energy to surmount it) can still cross the barrier with a finite probability through 
a process referred to as tunneling. The reason behind tunneling is that the particle's wave 
function decays exponentially inside the barrier. If the barrier is thin enough that the amplitude 
of the wave function has not declined to zero at the other side of the barrier, the particle 
has a finite probability of being found on that side. See Aside 1.8, where the operation of a 
tunneling junction is discussed. 
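
To make the exponential decay concrete, the sketch below evaluates the textbook transmission probability of a one-dimensional rectangular barrier of height $V_0$ and width $d$ for a particle of energy $E < V_0$. The rectangular barrier, the free-electron mass, the function names, and the sample numbers are assumptions made for illustration; real junctions have more complicated barrier shapes.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M_E = 9.1093837015e-31   # free-electron mass (kg)
EV = 1.602176634e-19     # one electron volt (J)


def decay_constant(energy_ev, barrier_ev, mass=M_E):
    """kappa = sqrt(2 m (V0 - E)) / hbar in 1/m, valid for E < V0."""
    return math.sqrt(2.0 * mass * (barrier_ev - energy_ev) * EV) / HBAR


def transmission(energy_ev, barrier_ev, width_m, mass=M_E):
    """Exact transmission of a rectangular barrier for 0 < E < V0."""
    kappa = decay_constant(energy_ev, barrier_ev, mass)
    prefactor = barrier_ev ** 2 / (4.0 * energy_ev * (barrier_ev - energy_ev))
    return 1.0 / (1.0 + prefactor * math.sinh(kappa * width_m) ** 2)


def transmission_thick(energy_ev, barrier_ev, width_m, mass=M_E):
    """Thick-barrier limit: 16 E (V0 - E) / V0^2 * exp(-2 kappa d)."""
    kappa = decay_constant(energy_ev, barrier_ev, mass)
    prefactor = 16.0 * energy_ev * (barrier_ev - energy_ev) / barrier_ev ** 2
    return prefactor * math.exp(-2.0 * kappa * width_m)


if __name__ == "__main__":
    # Illustrative case: 1 eV electron, 2 eV barrier, varying width.
    for width_nm in (0.25, 0.5, 1.0, 2.0):
        print(width_nm, transmission(1.0, 2.0, width_nm * 1e-9))
```

The steep drop of the transmission with width reflects the factor $\exp(-2\kappa d)$, which is why only barriers a few nanometers thick or thinner allow appreciable tunneling.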


Aside 1.8 Tunneling Junction 
Figure 1.13 (a) Schematic of a tunneling junction and (b) the symbol used to represent it. 

One basic yet useful device is the tunneling junction, built by placing two electrodes of 
the same or different materials across a barrier layer. As there are many choices for the 
electrodes and barrier-layer materials (including semiconductors, magnetic and nonmagnetic 
metals, or superconductors), a wide variety of tunneling junctions exists. All of them 
operate on the same quantum-mechanical principle, shown schematically in Figure 1.13. 
Electrons in each metallic electrode fill the quantum states up to the Fermi energy. 
When no voltage is applied, the Fermi levels of the two electrodes coincide, and no current 
passes through the tunneling junction. When a bias voltage $V$ is applied, the Fermi levels of 
the two electrodes are separated by an amount $q_e V$. When this voltage exceeds a threshold 
value, electrons can tunnel through the gap, resulting in a finite current. Based on 
quantum theory, the tunneling rate is given by [66] 


\[
\Gamma(\Delta W) = \frac{1}{q_e}\left[1 - \exp\left(-\frac{\Delta W}{k_B T}\right)\right]^{-1} I\!\left(\frac{\Delta W}{q_e}\right), \tag{1.54}
\]


where $\Delta W$ is the change in the free energy of the device because of tunneling and $I(V)$ 
is the current through the barrier at voltage $V$ in the absence of tunneling effects. It can 
be approximated by $I(V) = V/R_{TJ}$, where $R_{TJ}$ is the resistance of the tunneling barrier. 
Frequently, the approximation $\Delta W = \frac{q_e}{2}(V_i + V_f)$ is used, where $V_i$ and $V_f$ are the voltage 
drops across the barrier before and after the tunneling event, respectively. 


Charge transport in a quantum device is affected by the electric capacitance of that 
device. When such a device is isolated from neighboring devices by a barrier with much 
higher resistance (the conductance $G_B$ of the barrier satisfies $G_B \ll G_0$), it is possible not only 
to transport quantized charges in such devices but also to study interactions among these 
charges. Such a barrier is normally established by connecting the quantum conductor to 
neighboring devices via a tunneling junction. 

Consider a quantum device containing $N$ electrons and possessing a small capacitance 
$C_{QC}$. If $E_S$ is the electrostatic energy stored in this capacitor, it is easy to conclude that 
it depends on $N$ as $E_S(N) = (q_e N)^2/(2C_{QC})$. If another electron is added to this device, the 
additional energy needed is 


\[
E_S(N+1) - E_S(N) = (2N+1)E_{CE}, \qquad E_{CE} = \frac{q_e^2}{2C_{QC}}. \tag{1.55}
\]


The quantity $E_{CE}$ is referred to as the charging energy of a quantum conductor, as this is 
the amount of energy needed to charge a neutral quantum conductor with one electron. For 
many nanoscale devices, the capacitance $C_{QC}$ is so small that this charging energy becomes 
relatively large. If $E_{CE} \gg k_B T$, thermal energy cannot overcome the energy barrier. As a 
result, a large potential difference must be established across the tunnel junction to transport 
charges through a quantum conductor. This process becomes harder as we pump more 
and more charges into the quantum conductor, owing to the repulsion experienced by the 
incoming charge from charges stored inside it. The result is that, if the applied voltage is not 
sufficiently high, incoming charges cannot find a path to the quantum conductor, a phenomenon 
known as Coulomb blockade; it plays a significant role in nanoscale electronic devices. 
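
A rough feel for the numbers helps here. The sketch below evaluates the charging energy $q_e^2/(2C)$ and the temperature below which it exceeds the thermal energy by a chosen factor. The function names, the factor of 10 used to represent "much greater than," and the sample capacitances are assumptions made for illustration.

```python
Q_E = 1.602176634e-19   # elementary charge (C)
K_B = 1.380649e-23      # Boltzmann constant (J/K)


def charging_energy(capacitance):
    """Charging energy q_e^2 / (2 C) in joules for a capacitance in farads."""
    return Q_E ** 2 / (2.0 * capacitance)


def max_temperature(capacitance, ratio=10.0):
    """Temperature (K) below which the charging energy is at least
    `ratio` times k_B T. The default ratio is an illustrative choice."""
    return charging_energy(capacitance) / (ratio * K_B)


if __name__ == "__main__":
    # Illustrative capacitances: 1 fF, 1 aF, and 0.1 aF.
    for cap in (1e-15, 1e-18, 1e-19):
        energy_ev = charging_energy(cap) / Q_E
        print(cap, energy_ev, max_temperature(cap))
```

For a capacitance of 1 aF the charging energy is about 80 meV, roughly three times the thermal energy at room temperature, which is why Coulomb blockade usually requires either very small islands or low temperatures.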

Let us consider a simple example to illustrate the concept of Coulomb blockade. The 
electrostatic energy stored in an isolated capacitor is given by $Q^2/2C$, where $C$ is the 
capacitance and $Q$ is the charge on the positive electrode (and $-Q$ is the charge on the 
negative electrode). If an electron tunnels from the negative electrode to the positive one, 
the total charge on the positive electrode becomes $Q - q_e$, while the total charge on the negative 
electrode becomes $-(Q - q_e)$. Thus, the energy stored in the capacitor becomes $(Q - q_e)^2/2C$. 
As no additional energy is supplied to the capacitor from an external source, such a 
charge transfer cannot increase the energy of the system. This condition can only be 
met if the initial charge is larger than $q_e/2$, or the initial voltage is greater than $q_e/2C$. 
Thus, if the initial voltage is below $q_e/2C$ (but nonzero), electron transfers are forbidden 
for such a system. This is a classic example of the manifestation of the Coulomb-blockade 
phenomenon. 
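
The threshold quoted in this example follows from requiring that the stored energy not increase when one electron tunnels. A short derivation, written here as a sketch using only the quantities $Q$, $C$, and $q_e$ defined in the example, is:

```latex
\begin{align}
\frac{(Q - q_e)^{2}}{2C} &\le \frac{Q^{2}}{2C}, \\
q_e^{2} - 2 Q q_e &\le 0, \\
Q &\ge \frac{q_e}{2}, \qquad V = \frac{Q}{C} \ge \frac{q_e}{2C}.
\end{align}
```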

To observe Coulomb blockade, one should be able to localize electrons inside a quantum 
conductor. However, if electrons can move in and out of this device freely, it is not possible 
to localize them. The remedy is to close off any conductive channel through which electrons 
can escape to the surrounding environment (i.e., electrons must tunnel in and out of 
the quantum conductor whenever charge exchanges happen). The Landauer formula given 
in Eq. (1.52) shows that each channel contributes at most $2G_0$ to the conductance. Thus, if 
the conductance of the quantum conductor is smaller than this value, electrons must tunnel 
in and out of the conductor. 

Figure 1.14 Circuit known as the single-electron box and used for observing Coulomb blockade. 

The simplest circuit where Coulomb blockade can be observed is the single-electron box 
shown in Figure 1.14. It localizes electrons by allowing their exchange through a quantum-conductor 
island (black circle) only via tunneling [67]. The circuit contains a tunneling 
junction with capacitance $C_{TJ}$, which is coupled to a quantum-conductor island that itself 
is connected to a nontunneling (conventional) capacitor $C_G$. A voltage $V_G$ is applied to 
$C_G$ with the tunneling junction grounded, as shown in Figure 1.14. To observe single-electron 
effects, the total capacitance of the quantum-conductor island, $C_{TJ} + C_G$, must 
have a charging energy much greater than the thermal energy ($q_e^2/(C_{TJ} + C_G) \gg k_B T$). Let 
$q_G$ and $q_{TJ}$ be the charges on the positive plates of the capacitor and the tunneling junction, 
respectively. Under stationary operation, the number of electrons localized in the quantum 
island is found by minimizing the free energy. The free energy $E_{SEB}$ is not just the sum of 
electrostatic energies stored in each capacitor but must also contain a term representing the 
energy supplied automatically by the voltage source when electrons tunnel into and out of 
the quantum-conductor island: 


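\[
E_{SEB} = \frac{q_{TJ}^2}{2C_{TJ}} + \frac{q_G^2}{2C_G} - q_G V_G. \tag{1.56}
\]


Noting that charge conservation on the conductive island demands $q_{TJ} = q_G - q_e N$, and 
that the voltage drops across the two capacitors add up to the gate voltage, 
$V_G = q_G/C_G + q_{TJ}/C_{TJ}$, we can write $E_{SEB}$ in the form 


\[
E_{SEB} = \frac{q_e^2}{2(C_{TJ} + C_G)}\left(N - \frac{C_G V_G}{q_e}\right)^2 - \frac{(C_G V_G)^2}{2C_G}. \tag{1.57}
\]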


If the number of localized electrons is fixed, $E_{SEB}$ is a quadratic function of $V_G$, as 
shown in Figure 1.15(a) for several values of $N$. As the quantum-conductor island is 
charge neutral when $V_G = 0$, no additional electrons can be added or removed without 
adding energy to the system equal to the charging energy (owing to Coulomb blockade). 
As the voltage $V_G$ is increased, the energy of the configuration $N = 0$ increases, while 
that of $N = 1$ decreases, as seen in Figure 1.15(a). When $V_G = q_e/(2C_G)$, the energies 
of the two configurations are the same, and if $V_G$ is increased further, the configuration 
$N = 1$ becomes the lowest-energy configuration of the system. To attain this energetically 
favored state, one electron will tunnel through the junction with capacitance $C_{TJ}$, bringing the number 
of electrons to one. Thus, as the voltage is swept, quantum states with more and more 
localized electrons become energetically favored, and electrons tunnel through the tunneling 
junction to the quantum-conductor island as needed to attain that state. The staircase 
in Figure 1.15(b) represents the number of electrons in the island as a function of gate 
voltage $V_G$. 

Figure 1.15 (a) Free energy $E_{SEB}$ as a function of gate voltage $V_G$ for several values of $N$ in the quantum-conductor island; (b) number of localized electrons ($N$) versus $V_G$. 
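
The staircase behavior can be reproduced numerically. The sketch below assumes that the free energy of the single-electron box takes the quadratic form $E_{SEB}(N) = \frac{q_e^2}{2(C_{TJ}+C_G)}(N - C_G V_G/q_e)^2 - C_G V_G^2/2$ (our reading of Eq. (1.57)); the function names, the search range `n_max`, and the 1 aF capacitances are hypothetical choices for illustration.

```python
Q_E = 1.602176634e-19  # elementary charge (C)


def seb_energy(n, v_gate, c_tj, c_g):
    """Assumed free energy of the single-electron box with n localized electrons."""
    c_sum = c_tj + c_g
    parabola = Q_E ** 2 / (2.0 * c_sum) * (n - c_g * v_gate / Q_E) ** 2
    # The second term does not depend on n, so it does not affect the staircase.
    return parabola - 0.5 * c_g * v_gate ** 2


def ground_state_n(v_gate, c_tj, c_g, n_max=20):
    """Number of localized electrons that minimizes the free energy."""
    candidates = range(-n_max, n_max + 1)
    return min(candidates, key=lambda n: seb_energy(n, v_gate, c_tj, c_g))


if __name__ == "__main__":
    c_tj = c_g = 1e-18          # illustrative capacitances (F)
    step = Q_E / c_g            # gate-voltage period of the staircase (V)
    for fraction in (0.0, 0.25, 0.75, 1.25, 1.75, 2.25):
        v_gate = fraction * step
        print(fraction, ground_state_n(v_gate, c_tj, c_g))
```

Under these assumptions the electron number increases by one each time $V_G$ crosses an odd multiple of $q_e/(2C_G)$, in line with the degeneracy point $V_G = q_e/(2C_G)$ discussed in the text.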

Even though the single-electron box can deterministically change the number of electrons 
in the quantum-conductor island by changing the gate voltage, it is of limited 
use for applications. This is because no simple mechanism exists to count the number 
of electrons inside the island. However, the knowledge and insight gained from the operating 
principle of the single-electron box can be extended to produce the most basic 
single-electron device, the single-electron transistor (SET). The operation of the SET is discussed 
in Aside 1.9. One important application of SETs is sensing small amounts of 
charge [67]. In 2001, an SET-based electrometer exhibited a record sensitivity of $3.2 \times 10^{-6}$ 
electrons/$\sqrt{\mathrm{Hz}}$ [68]. Such low values were realized by setting the bias voltages of the 
SET close to the Coulomb-blockade voltage so that the current through the SET is 
strongly influenced by the potential of the quantum-conductor island as well as the gate 
potentials. 


Aside 1.9 Single Electron Transistor (SET) 

An SET is built by connecting two tunneling junctions and one conventional capacitor to 
a quantum-conductor island as shown in Figure 1.16. Since such a device has two current 
leads and one voltage lead, the circuit resembles that of a field-effect transistor, but there are 
significant operational differences. The first demonstration of the SET device was carried 
out in 1987 [69]. 
Figure 1.16 Schematic of a single-electron transistor. The circuit is similar to a single-electron box, but another tunneling junction 
is used to connect to the quantum-conductor island (black circle). 

To enable a net current through the device, the biasing scheme is designed so that charge 
transfer is possible through both tunneling junctions. The free energy of the SET is calculated 
in a manner similar to that for a single-electron box and has the form 


\[
E_{SET}(N) = \frac{q_e^2}{2(C_L + C_R + C_G)}\left(N - \frac{Q_I}{q_e}\right)^2, \tag{1.58}
\]


where the induced charge $Q_I = C_G V_G + C_L V_L + C_R V_R$ has contributions from all voltage 
sources biasing the SET and $N$ is the number of localized electrons in the quantum-conductor 
island. Tunneling of electrons into this island from the tunneling junction with 
capacitance $C_L$ is possible provided $q_e V_L > [E_{SET}(N+1) - E_{SET}(N)]$. Once there are 
$N + 1$ charges, it is possible to remove one from the tunneling junction with capacitance $C_R$ 
by setting $q_e V_R < [E_{SET}(N+1) - E_{SET}(N)]$. If both these conditions are met, a current will 
flow through the device. This can be achieved by symmetrically biasing the device such 
that $V_L = -V_R$. 
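
The two tunneling conditions of this Aside can be checked numerically. The sketch below assumes the free energy $E_{SET}(N) = \frac{q_e^2}{2(C_L+C_R+C_G)}(N - Q_I/q_e)^2$ with $Q_I = C_G V_G + C_L V_L + C_R V_R$ and applies the two inequalities exactly as stated; the function names, the equal 1 aF capacitances, and the bias values are hypothetical.

```python
Q_E = 1.602176634e-19  # elementary charge (C)


def set_energy(n, v_l, v_r, v_g, c_l, c_r, c_g):
    """Assumed SET free energy with n localized electrons on the island."""
    c_sum = c_l + c_r + c_g
    q_induced = c_g * v_g + c_l * v_l + c_r * v_r
    return Q_E ** 2 / (2.0 * c_sum) * (n - q_induced / Q_E) ** 2


def set_conducts(n, v_l, v_r, v_g, c_l, c_r, c_g):
    """True when an electron can both enter through the left junction and
    leave through the right junction, per the two stated inequalities."""
    delta = (set_energy(n + 1, v_l, v_r, v_g, c_l, c_r, c_g)
             - set_energy(n, v_l, v_r, v_g, c_l, c_r, c_g))
    return Q_E * v_l > delta and Q_E * v_r < delta


if __name__ == "__main__":
    c_l = c_r = c_g = 1e-18          # illustrative capacitances (F)
    for v in (0.02, 0.03):           # symmetric bias: V_L = v, V_R = -v
        print(v, set_conducts(0, v, -v, 0.0, c_l, c_r, c_g))
    # Gate tuned to the degeneracy point: a tiny bias is enough.
    print(set_conducts(0, 1e-3, -1e-3, Q_E / (2.0 * c_g), c_l, c_r, c_g))
```

With these numbers the device is blockaded at the smaller bias and conducts at the larger one, while tuning the gate to $V_G = q_e/(2C_G)$ removes the energy cost of adding an electron so that even a small bias drives a current.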


1.6 Organization and Overview of the Material 


In this introductory chapter we have tried to give a flavor of new phenomena that can 
occur in nanoscale quantum devices. After discussing in Section 1.1 new features that may 
appear at the nanoscale, we introduced in Section 1.2 several important length scales for 
nanosize objects that become important in later chapters. We discussed in Section 1.3 how 
quantum confinement leads to quantization of the electron’s energies and how it affects 
the density of states in nanostructures. We emphasized in Section 1.4 the importance of the 
wave function’s phase leading to quantum interference. We discussed in Section 1.5 charge 
transport mechanisms in nanoscale devices and introduced the concept of quantum tunneling. 
The aim was to review the basic but important concepts that will help the reader 
understand the material appearing in later chapters. The 
following is a brief summary of the topics covered in subsequent chapters. 

We review in Chapter 2 the main classical and quantum-mechanical concepts needed for 
understanding the later material. We discuss in two sections the Lagrangian and Hamilto- 
nian formalisms and introduce the important concepts such as a quantum state, eigenvalues 
and eigenstates, and the density operator. The time evolution of a quantum system is dis- 
cussed in the Schrödinger, Heisenberg, and interaction pictures together with perturbation 
theory. Section 2.3 is devoted to the quantization of an electromagnetic field, whereas the 
concept of fermions and bosons is discussed in Section 2.4. 

Chapter 3 is devoted to the linear response of a system. We begin by introducing the 
concept of impulse response in Section 3.1 and then focus on the three ensembles used in 
classical and quantum statistical mechanics for systems in thermal equilibrium. The Kubo 
formula governing the linear response of a system is discussed in Section 3.2 where we also 
introduce the concept of generalized susceptibility. The fluctuation-dissipation theorem is 
derived in Section 3.3 after considering the dynamic correlation function and applied in 
the context of Johnson-Nyquist noise. The focus of Section 3.4 is on the derivation of the 
dielectric function, and surface plasmons are discussed in Section 3.6. 

Chapter 4 considers how dissipation and decoherence can occur in a quantum device 
through its interaction with its external environment. As an example, we discuss in Section 4.1 
spontaneous emission in a two-level atomic system using the Jaynes-Cummings 
Hamiltonian and present details of the theory developed by Weisskopf and Wigner. The master-equation 
approach is discussed in Section 4.2. The focus of Section 4.3 is on the derivation 
of the Lindblad equation and its applications to a damped harmonic oscillator and a damped 
two-level atom. Section 4.4 is devoted to the derivation of the Redfield equation. 

Current flow inside a quantum device is considered in Chapter 5. We discuss the main 
features of quantum transport in Section 5.1 and then use the Landauer-Büttiker method in 
Section 5.2 to derive an expression for the current using an approach based on the concept 
of a scattering matrix. Section 5.3 employs the nonequilibrium Green’s function method to 
calculate the current flowing through a quantum device. 

Chapter 6 focuses on quantum tunneling. After discussing the physics behind tunneling 
in Section 6.1, we present Gamow's theory of tunneling. The well-known Wentzel-Kramers-Brillouin 
(WKB) method is used in Section 6.2 to develop a quantum theory of 
tunneling. The time dependence of tunneling is analyzed in Section 6.3 using the transfer 
Hamiltonian method. The phenomena of sequential and resonant tunneling are discussed 
in Sections 6.4 and 6.5, respectively. 

The focus of Chapter 7 is on quantum noise. We introduce the basic concepts in Sec- 
tion 7.1 where we also discuss the main sources of noise leading to thermal noise, shot 
noise, amplifier noise, and Brownian motion of a particle. The classical concept of spec- 
tral density is extended to the quantum domain in Section 7.2, where we also calculate the 
quantum spectral density when a harmonic oscillator and a two-level atom are coupled to 
a noise source. We discuss in Section 7.3 the role of quantum Langevin equations by con- 
sidering the quantum theory of a laser. The Langevin formalism is also used in Section 7.4 
to calculate the noise spectra associated with the intensity and phase noise of lasers. 


Quantum-Mechanical Framework 


Quantum mechanics is not a theory about reality, it is a prescription for making the best 
possible predictions about the future if we have certain information about the past. 
G. 't Hooft, Journ. Stat. Phys. 53, 323 (1988) 


2.1 Review of Classical Mechanics 




A mathematical description of a nanoscale device is typically based on the equations of 
motion describing how different parts making up the device change with time [70]. Such a 
description depends heavily on our understanding of the laws governing the device and on 
the approximations adopted in formulating the equations of motion. For example, a single 
device could have different equations of motion, depending on whether we use Newtonian 
mechanics, quantum mechanics, or relativistic mechanics to describe it [71]. So, one may 
ask whether a single set of equations of motion can be used if one agrees in advance on the 
physics describing the device. The answer is clearly no, because the equations of motion 
change with the type of coordinates used for locating various parts of a system relative 
to an agreed reference point (the origin of the coordinate axes). The only requirement for 
such coordinates is that they should be sufficiently unique to identify each and every part 
of a device. 


2.1.1 Generalized Coordinates 


The coordinates used in classical mechanics are referred to as generalized coordinates [72, 
73]. As different choices of generalized coordinates result in different equations of motion 
for the same device, the equations of motion for any device are not unique. The situation 
is further complicated because specific properties of the components, referred to as the 
constitutive relations, are required to thread the physical laws governing the constraints 
restricting motion of the device [74]. As this process of deriving the equations of motion is 
somewhat ad hoc, there is no simple way to predict whether one has obtained a sufficient 
number of independent equations that can be solved to find time variations of generalized 
coordinates of the device. The Lagrangian and Hamiltonian methods avoid most of these 
issues and provide a structured approach to deriving the equations of motion described by 
a set of generalized coordinates [75]. The versatility of the Lagrangian method stems from 
the fact that, regardless of the physical phenomena of concern (related to electromagnetics, 
quantum mechanics, or Newtonian mechanics), the associated generic technique can be 
readily applied to analyze and learn about any system [76, 77]. 

Science and engineering utilize in practice a few standard coordinate systems to represent 
points in two- or three-dimensional (3D) space [72, 73]. For example, coordinates based on 
rectilinear orthogonal axes are referred to as Cartesian coordinates. Similarly, coordinates 
based on angles from a baseline are referred to as polar coordinates. It is conventional 
that coordinates of a specific point appear in a set with a predetermined order [78]. For 
example, the coordinates of a point in a 3D Cartesian system are written as $(x, y, z)$, and the 
polar coordinates in a plane are denoted as $(r, \theta)$. However, for the generalized coordinates 
of a system, there is no prescription for the number of coordinates or their order in a set; the 
only restriction is that they must be scalars [79]. This gives us the freedom to be creative 
in inventing coordinates that are not only intuitive but also convenient to use in theory and 
computational work. 

Among the many possible generalized coordinate systems available for representing a quantum 
device, a complete and independent set is preferred [80]. A generalized coordinate 
system is deemed complete if it can locate all parts of a device in all geometrically admis- 
sible configurations at all times. The geometrically admissible configurations are the set 
of configurations that satisfy the geometric constraints, but not necessarily the underlying 
physical principles valid for a particular configuration. A generalized coordinate system is 
deemed independent if, when all but one of them are fixed in value, there remains a contin- 
uous range of values for the unfixed coordinate. If both these conditions are simultaneously 
satisfied by a generalized coordinate system, it is referred to as a complete and independent 
generalized coordinate system. 

When generalized coordinates are used to formulate the equations of motion for a 
device, we need to consider admissible variations, which represent hypothetical infinitesi- 
mal changes subject to geometric constraints of the device (but may not necessarily satisfy 
other physical principles applicable to the situation). The origin of this concept can be 
traced back to virtual displacements in classical mechanics, where infinitesimal changes of 
the system coordinates can occur while time is held constant [81]. It is called virtual rather 
than real since no actual displacement can take place instantaneously (as time is assumed 
frozen during such displacements). However, the importance of this concept lies in that 
it provides a procedural technique for studying the behavior of complex interacting sys- 
tems described by generalized coordinate systems. As we shall see soon, such infinitesimal 
admissible variations are essentially test quantities that reveal interactions among forces 
(or generalized forces) internal to a system. As time is fixed when virtual displacements 
happen, unlike normal (real) displacements, they are defined with respect to a parameter 
$\eta$ enumerating paths of the motion varied in a manner consistent with the geometric 
constraints. The symbol $\delta$ is traditionally used as the operator that acts on generalized 
coordinates to generate variations as [82] 


\[
\delta \equiv \left.\frac{\partial}{\partial \eta}\right|_{\eta \to 0}. \tag{2.1}
\]

Owing to this definition, the $\delta$ operator follows the same rules as the familiar differential 
operator. The size of the set of independent admissible variations is called the degrees 
of freedom of the system. It is a property of the system and thus has the same value no 
matter which set of generalized coordinates we use (there may be an infinite number of 
ways of doing so). A system is said to be holonomic if the number of degrees of freedom equals 
the number of generalized coordinates [83]; otherwise, the system is nonholonomic. Aside 
2.1 illustrates an application of this concept to an $N$-particle system, subjected to several 
geometric constraints, using a 3D Cartesian coordinate system. In particular, it shows that 
admissible virtual displacements preserve the geometric constraints. 


Aside 2.1 Admissible Virtual Displacements with Holonomic Constraints 

Consider a system of $N$ particles with the coordinates $(x_i, y_i, z_i)$ for the $i$th particle 
($i = 1, \ldots, N$). Suppose the motion of these particles is restricted by $K$ holonomic constraints, 
denoted by the functions $\phi_j(x_1, y_1, z_1, \ldots, x_N, y_N, z_N) = 0$ for $j = 1, 2, \ldots, K$. It is important 
to note that these constraints are geometric in nature, and no explicit time dependence 
appears in any of them. The holonomic property assures that the constraints can always 
be written using the generalized coordinates. As admissible variations $(\delta x_i, \delta y_i, \delta z_i)$ are 
infinitesimal values, we can use differential calculus to find the relation [84]: 


\[
\frac{\partial \phi_j}{\partial x_1}\,\delta x_1 + \cdots + \frac{\partial \phi_j}{\partial x_N}\,\delta x_N = 0; \qquad j = 1, 2, \ldots, K. \tag{2.2}
\]


As all displacements in this equation are admissible variations, the resulting configuration 
should still adhere to the same geometric constraints. We can check this by expanding the 
constraint for each $j$ in a Taylor series to the first order as 


\[
\phi_j(x_1 + \delta x_1, \ldots, x_N + \delta x_N) = \phi_j(x_1, \ldots, x_N) + \frac{\partial \phi_j}{\partial x_1}\,\delta x_1 + \cdots + \frac{\partial \phi_j}{\partial x_N}\,\delta x_N. \tag{2.3}
\]


However, all terms on the right side vanish owing to the constraint itself and the relation 
in Eq. (2.2). The same argument holds in the $y$ and $z$ directions. Thus, admissible virtual 
displacements satisfy all constraints. This analysis cannot be used for nonholonomic 
constraints because they cannot be explicitly expressed as an equation involving only the 
generalized coordinates [83]. 
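
A small numerical check makes the argument of this Aside tangible. The sketch below uses a hypothetical example that is not from the text: a single particle constrained to a circle, $\phi(x, y) = x^2 + y^2 - R^2 = 0$. An admissible variation must satisfy the first-order condition summed over both coordinates, $x\,\delta x + y\,\delta y = 0$, so the constraint should be violated only at second order in the size of the displacement.

```python
import math


def constraint(x, y, radius=1.0):
    """Holonomic constraint phi(x, y) = x^2 + y^2 - R^2 (zero on the circle)."""
    return x * x + y * y - radius * radius


def admissible_variation(x, y, eps):
    """Tangential displacement; it satisfies x*dx + y*dy = 0 exactly."""
    return -eps * y, eps * x


def constraint_violation(theta, eps, radius=1.0):
    """Value of phi after displacing a point on the circle by an
    admissible variation of size eps."""
    x, y = radius * math.cos(theta), radius * math.sin(theta)
    dx, dy = admissible_variation(x, y, eps)
    return constraint(x + dx, y + dy, radius)


if __name__ == "__main__":
    # Shrinking eps by 10 shrinks the violation by about 100 (second order).
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, constraint_violation(0.7, eps))
```

The violation scales as the square of the displacement, so it vanishes to first order, which is the sense in which admissible virtual displacements preserve the constraint.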


The concept of virtual work is fundamental in the Lagrangian and Hamiltonian methods 
because it is directly associated with the energy of a system [81]. It is formally defined 
as the total work done by all nonconservative forces acting on the system as a result 
of an admissible virtual displacement of the entire configuration. As this definition only 
involves nonconservative forces (e.g., friction), conservative forces (such as electrical and 
gravitational forces) must not be considered in virtual-work calculations [85]. 


2.1.2 Lagrangian Formalism 


As current literature on nanoscale devices makes use of topology terminology, we introduce 
it on an as-needed basis in this book. Aside 2.2 will help the reader in understanding the 
following material. 

Consider a quantum device with $N$ generalized coordinates denoted by $q_1, q_2, \ldots, q_N$. 
As each generalized coordinate is a scalar, the entire set represents a configuration vector, 
$\mathbf{q} = [q_1, \ldots, q_N]^T$, in the manifold $\mathbb{R}^N$. We denote the time derivative of this configuration 
vector (referred to as generalized velocities) by $\dot{\mathbf{q}} = [\dot{q}_1, \ldots, \dot{q}_N]^T$. This vector lies in the 
tangent space $T_{\mathbf{q}}\mathbb{R}^N$, which is diffeomorphic to $\mathbb{R}^N$. Moreover, the combination 
$(\mathbf{q}, \dot{\mathbf{q}})$ lies in the tangent bundle $T\mathbb{R}^N$. The Lagrangian $\mathcal{L}$ is a real-valued function on this 
tangent bundle and is defined as [86] 


\[
\mathcal{L}(\mathbf{q}, \dot{\mathbf{q}}) = T(\mathbf{q}, \dot{\mathbf{q}}) - U(\mathbf{q}), \tag{2.4}
\]


where $T(\mathbf{q}, \dot{\mathbf{q}})$ is the kinetic energy function and $U(\mathbf{q})$ is the potential energy function. The 
identification of the two terms as kinetic and potential energies does not always hold, but 
the generic properties derived from a Lagrangian mostly stay the same. 

As the definition of the Lagrangian relies on various forms of energy associated with a 
system, it is often much easier to find it using elementary means. Most importantly, even 
though generalized coordinates are used to label each energy function, being energies, the 
values of T and U do not depend on the actual coordinates used. Thus, such an approach 
is essentially a coordinate-independent description of the system, allowing one to obtain 
equations of motion using nonconventional conditions (e.g., noninertial frames in strong 
gravitational fields). 


Aside 2.2 Tangent Vectors, Tangent Spaces, and Tangent Bundles [86, 87] 

A manifold is an abstract mathematical space that locally appears to be a Euclidean space 
but globally may have a complicated structure. The most familiar manifold is 
the surface of the earth; it is locally flat but globally spherical. A manifold can be constructed 
by joining separate Euclidean spaces together, just as the surface of the earth 
can be depicted by joining maps of local regions together and accounting for the 
distortions resulting from the spherical nature of the global surface. Taking this analogy 
further, an atlas describes how a manifold is constructed by joining together simpler pieces 
that are represented by their own corresponding charts (also known as local coordinate 
systems). If the charts are in the Euclidean space (denoted as $\mathbb{R}^N$ in $N$ dimensions), the 
resulting manifold is called a topological manifold, and it behaves locally like a Euclidean space 
(i.e., the two are locally homeomorphic). 


If the manifold $M$ has additional properties such as differentiability, many familiar operations 
in calculus can be carried out on it. For example, it is possible to define a real vector 
space called the tangent space, denoted as $T_q M$, at any point $q \in M$. The tangent space 
contains tangents to all possible curves passing through that point. If one collects all these 
tangent spaces for the manifold $M$ (denoted as $\bigcup_{q \in M} T_q M$), the resulting structure is called 
the tangent bundle of the differentiable manifold $M$ (denoted as $TM$). The Lagrangian 
that we discuss in this chapter is a natural energy function defined on the tangent 
bundle. 
The action integral $S(\mathbf{q})$ is a functional of the generalized coordinates $\mathbf{q}$ and velocities 
$\dot{\mathbf{q}}$ and is defined with the aid of the Lagrangian as [86] 


\[
S(\mathbf{q}) = \int_{t_0}^{t_1} \mathcal{L}(\mathbf{q}, \dot{\mathbf{q}})\, dt, \tag{2.5}
\]


where the integration is over a fixed time interval ranging from $t_0$ to $t_1$. As the generalized 
coordinates define a path in a manifold, we call $\mathbf{q}(t)$ a trajectory of the system. The 
action integral is defined for any trajectory taken by the system over the time duration 
covered by the integral. The actual trajectory of the system is selected by the principle of 
stationary action (also referred to as Hamilton's action principle). It states that the physical 
trajectory of the system must yield a stationary value of the action integral. In other 
words, infinitesimal variations in the integration path $\mathbf{q}(t)$ produce no first-order 
change in the action integral $S(\mathbf{q})$ when such variations occur around the 
physical trajectory, provided the initial and final configurations are held fixed. 
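
The principle of stationary action can be checked numerically on a simple case. The sketch below uses a hypothetical example not taken from the text: a one-dimensional harmonic oscillator with unit mass and unit stiffness, whose physical path between $q(0) = 0$ and $q(1) = 1$ is $\sin t/\sin 1$. A variation $\epsilon\sin(\pi t)$, which vanishes at both end points, is added, and the action is evaluated with a midpoint rule. All names and numbers are illustrative assumptions.

```python
import math


def action(q, q_dot, t0, t1, steps=2000, mass=1.0, stiffness=1.0):
    """Midpoint-rule estimate of the action, the time integral of T - U,
    for a 1D harmonic oscillator following the path q(t)."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        kinetic = 0.5 * mass * q_dot(t) ** 2
        potential = 0.5 * stiffness * q(t) ** 2
        total += (kinetic - potential) * h
    return total


def varied_path(eps):
    """Physical path sin(t)/sin(1) plus the variation eps*sin(pi*t),
    which is zero at t = 0 and t = 1 (fixed end points)."""
    def q(t):
        return math.sin(t) / math.sin(1.0) + eps * math.sin(math.pi * t)

    def q_dot(t):
        return math.cos(t) / math.sin(1.0) + eps * math.pi * math.cos(math.pi * t)

    return q, q_dot


def action_change(eps):
    """Change in the action caused by a variation of size eps."""
    q, q_dot = varied_path(eps)
    q_ref, q_ref_dot = varied_path(0.0)
    return action(q, q_dot, 0.0, 1.0) - action(q_ref, q_ref_dot, 0.0, 1.0)


if __name__ == "__main__":
    # Doubling eps should roughly quadruple the change (no first-order term).
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, action_change(eps))
```

The change in the action scales as $\epsilon^2$ rather than $\epsilon$, which is the numerical signature of a stationary point: the first-order term vanishes around the physical trajectory.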

The Lagrangian formalism is applicable to many branches of physics and can be used 
for systems involving either particles or waves [88]. In this approach, the full dynamics of 
a physical system is contained in its Lagrangian, and equations of motion are derived from 
it by invoking the principle of stationary action. As discussed before, the formalism uses 
virtual displacements in generalized coordinates but succeeds in generating the equations 
of motion applicable to real (measurable) quantities described by the system. However, 
as an extension of this process, it is possible to look at real displacements (as opposed to 
virtual displacements) on the action integral to derive a new set of equations. These are 
referred to in literature as the Jacobi equations. 

Suppose $Q(t)$ is the physical trajectory of a system. Consider a new trajectory $q(t)$ resulting
from an infinitesimal variation, $\delta q(t)$, that is, $q(t) = Q(t) + \delta q(t)$. As the initial and final
configurations are fixed, $\delta q(t_0) = \delta q(t_1) = 0$. The change $\delta S$ in the value of the action
integral to the first order in $\delta q$ is given by

$$\delta S(Q) = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q} \cdot \delta q + \frac{\partial L}{\partial \dot{q}} \cdot \delta \dot{q} \right) dt, \tag{2.6}$$


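where the partial derivatives are evaluated on the trajectory $Q(t)$. If we integrate the
second term in this integral using the "integration by parts" method and use the boundary
conditions that $\delta q(t_0) = \delta q(t_1) = 0$, we obtain

$$\delta S(Q) = \int_{t_0}^{t_1} \left[ \frac{\partial L}{\partial q} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}} \right) \right] \cdot \delta q(t)\, dt. \tag{2.7}$$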


We apply the stationary action principle to this equation and demand that $\delta S(Q) = 0$ for
any arbitrary infinitesimal variation $\delta q(t)$ in the trajectory from its physical trajectory. This
could only happen if the quantity inside the brackets vanishes. Setting it equal to zero, we
obtain the well-known Euler-Lagrange equations given by

$$\frac{\partial L}{\partial q} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}} \right). \tag{2.8}$$

Even though we derived these equations using the Lagrangian in Eq. (2.4), there are
systems for which Euler-Lagrange equations are satisfied but they do not agree with the
definition in Eq. (2.4) (see Aside 2.3). In field theory, it is customary to use the
Euler-Lagrange equations as the definition of the Lagrangian, which could even be an explicit
function of time. However, it is important to realize that Euler-Lagrange equations are
necessary, but not sufficient, for realizing stationary action.


Aside 2.3 A Point Charge Subjected to Electric and Magnetic Fields [89]

Consider a point charge, with mass $m$ and charge $q$, moving in a region where both the
electric ($\mathbf{E}$) and magnetic ($\mathbf{B}$) fields are present. The Lorentz force experienced by the
particle is given by $q(\mathbf{E} + \dot{\mathbf{r}} \times \mathbf{B})$, where $\dot{\mathbf{r}}$ is the velocity of the charge at point $\mathbf{r}$.
The equation of motion of the point charge is given by Newton's second law:
$m\ddot{\mathbf{r}} = q(\mathbf{E} + \dot{\mathbf{r}} \times \mathbf{B})$.
However, as we saw in Section 1.4, the scalar and vector potentials $V(\mathbf{r}, t)$ and $\mathbf{A}(\mathbf{r}, t)$ often
play an important role. They are related to the $\mathbf{E}$ and $\mathbf{B}$ fields as indicated in Eq. (1.36).


As there is no simple strategy for finding the Lagrangian of a given system, we often need
to guess its form and test it by applying the Euler-Lagrange equation to recover the known
equation of motion. A suitable form for the charged particle is found to be

$$L(\mathbf{r}, \dot{\mathbf{r}}, t) = \frac{1}{2} m \dot{\mathbf{r}}^2 - qV(\mathbf{r}, t) + q\dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t). \tag{2.9}$$

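The first two terms are in the form of Eq. (2.4) but the third term depends on the vector
potential. We apply the Euler-Lagrange equation given in Eq. (2.8) with $\mathbf{r}$ acting as $q$.
Using $\partial L/\partial \dot{\mathbf{r}} = m\dot{\mathbf{r}} + q\mathbf{A}(\mathbf{r}, t)$, we obtain

$$\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\mathbf{r}}} \right) = m\ddot{\mathbf{r}} + q\frac{\partial \mathbf{A}}{\partial t} + q(\dot{\mathbf{r}} \cdot \nabla)\mathbf{A}, \tag{2.10}$$

where we used the relation $\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + (\dot{\mathbf{r}} \cdot \nabla)\mathbf{A}$. The $\mathbf{r}$ derivative is found to be

$$\frac{\partial L}{\partial \mathbf{r}} = -q\nabla V + q\nabla(\dot{\mathbf{r}} \cdot \mathbf{A}) = -q\nabla V + q(\dot{\mathbf{r}} \cdot \nabla)\mathbf{A} + q\dot{\mathbf{r}} \times (\nabla \times \mathbf{A}), \tag{2.11}$$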


where we used a well-known vector identity. If we equate these two derivatives and use
the relations in Eq. (1.36), the equation of motion is found to be $m\ddot{\mathbf{r}} = q(\mathbf{E} + \dot{\mathbf{r}} \times \mathbf{B})$. This
confirms that the chosen Lagrangian is appropriate for the charged particle.


Even though this Lagrangian can reproduce the equation of motion, it cannot be written as
the difference of the kinetic energy ($\frac{1}{2} m \dot{\mathbf{r}}^2$) and some potential energy. The reason is that
$U = qV(\mathbf{r}, t) - q\dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t)$ contains a velocity-dependent term. As a result, the vector $-\nabla U$
does not represent the force acting on the particle.


The Lagrangian is not even unique for any system. Consider two Lagrangians, $L(q, \dot{q}, t)$
and $L(q, \dot{q}, t) + \frac{d}{dt} f(q, t)$, where $f(q, t)$ is an arbitrary function that depends on $q$ and $t$ but
not on $\dot{q}$. It is easy to deduce from the form of the Euler-Lagrange equations (2.8) that
both lead to the same equations of motion. Moreover, any scaling of the Lagrangian by a
constant also leads to the same equations of motion.
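
The invariance under the addition of a total time derivative can be verified in a few lines. The following is a sketch for a single generalized coordinate, where $L'$ is a label introduced here for the modified Lagrangian:

```latex
L' = L + \frac{d}{dt} f(q,t)
   = L + \frac{\partial f}{\partial q}\,\dot{q} + \frac{\partial f}{\partial t},
\qquad
\frac{\partial L'}{\partial \dot{q}} = \frac{\partial L}{\partial \dot{q}} + \frac{\partial f}{\partial q},
\\[4pt]
\frac{d}{dt}\!\left(\frac{\partial L'}{\partial \dot{q}}\right)
   = \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot{q}}\right)
   + \frac{\partial^2 f}{\partial q^2}\,\dot{q}
   + \frac{\partial^2 f}{\partial t\,\partial q},
\qquad
\frac{\partial L'}{\partial q}
   = \frac{\partial L}{\partial q}
   + \frac{\partial^2 f}{\partial q^2}\,\dot{q}
   + \frac{\partial^2 f}{\partial q\,\partial t}.
```

The terms involving $f$ are identical on both sides of Eq. (2.8) and cancel, so $L'$ and $L$ yield the same equation of motion.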

2.1.3 The Hamiltonian


The concept of momentum is very useful for quantum mechanics. Consider what happens
when the upper limit $t_1$ of the action integral in Eq. (2.5) is replaced with $t$, i.e.,
$S(q, t) = \int_{t_0}^{t} L(q, \dot{q})\, dt'$. Now consider the variation of the action integral when path $q(t)$ is changed
by $\delta q$. It is easy to show that Eq. (2.7) must be replaced with

$$\delta S = \int_{t_0}^{t} \left[ \frac{\partial L}{\partial q} - \frac{d}{dt'}\left( \frac{\partial L}{\partial \dot{q}} \right) \right] \cdot \delta q\, dt' + \frac{\partial L}{\partial \dot{q}} \cdot \delta q. \tag{2.12}$$

The integral vanishes if the Euler-Lagrange equation (2.8) is satisfied, and we obtain the
simple result $\delta S = p \cdot \delta q$, where the generalized momentum (also referred to as the
canonical momentum [86]) is defined as $p = \partial L/\partial \dot{q}$.


Aside 2.4 Legendre Transformation [80, 86]

The Legendre transformation can be viewed as the conversion of one scalar function into 
another. The transformation is reversible under quite general conditions such that the result- 
ing two functions are the Legendre transform of each other. It is common to call them the 
dual of one another. 


Consider a function $A(q_1, q_2, \ldots, q_N)$ of $N$ variables. We define a new set of variables as
$p_i = \partial A/\partial q_i$ for $i = 1$ to $N$. The Legendre transformation of $A$ is defined as

$$B(p_1, p_2, \ldots, p_N) = \sum_{i=1}^{N} p_i q_i - A(q_1, q_2, \ldots, q_N). \tag{2.13}$$


Noting that $\partial B/\partial p_i = q_i$ for $i = 1$ to $N$, the original function $A$ can be viewed as being the
Legendre transformation of $B$, confirming that each is the dual of the other.


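One may argue that $B$ may be a function of both sets of variables $p_i$ and $q_i$ for $i = 1$ to $N$,
and it is not justified to assume that $B$ depends only on $p$ variables. One way to resolve this
issue is to calculate the total differential of $B$:

$$dB = \sum_{i=1}^{N} \left( p_i\, dq_i + q_i\, dp_i - \frac{\partial A}{\partial q_i}\, dq_i \right) \;\Rightarrow\; dB = \sum_{i=1}^{N} \left[ q_i\, dp_i + \left( p_i - \frac{\partial A}{\partial q_i} \right) dq_i \right]. \tag{2.14}$$
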


From the definition of $p_i$, it follows that the coefficient of $dq_i$ is zero for all $i = 1$ to $N$.
Therefore, $dB$ changes only with variables $p_1, p_2, \ldots, p_N$, confirming our original assertion.
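
A one-variable case makes the duality concrete. The following is a minimal sketch, assuming the quadratic function A(q) = a q^2 / 2 with an arbitrary constant a (an illustrative choice, not taken from the text). The helper `legendre` expects the caller to supply the inverse of the derivative map, so that q can be expressed through p.

```python
def legendre(A, dA_inv):
    """Return B with B(p) = p*q - A(q), where q = dA_inv(p) solves p = dA/dq."""
    def B(p):
        q = dA_inv(p)
        return p * q - A(q)
    return B

a = 3.0
A = lambda q: 0.5 * a * q * q            # A(q) = a q^2 / 2, so p = dA/dq = a q
B = legendre(A, lambda p: p / a)         # analytically B(p) = p^2 / (2a)

# Transforming B back: dB/dp = p / a = q, so p = a q recovers A.
A_back = legendre(B, lambda q: a * q)

print(B(2.0), 2.0 ** 2 / (2 * a))
print(A_back(1.5), A(1.5))
```

Applying the transformation twice returns the original function, which is the sense in which A and B are duals of each other.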


Hamilton used the canonical momenta to define a new functional, now known as the
Hamiltonian and denoted as $H(q, p, t)$. Aside 2.4 shows how the Legendre transformation
of a Lagrangian can be used for this purpose. As discussed there, the Hamiltonian can be
considered a dual of the Lagrangian. We assume that the relations $p = \partial L/\partial \dot{q}$ are invertible
in the sense that $\dot{q}$ can be uniquely expressed in terms of $p$ and $q$. This implies that the
mapping $(q, \dot{q}) \to (q, p)$ exists, a property of the Lagrangian referred to as hyperregularity.
Any Lagrangian failing to have this property is called degenerate; the Lagrangians of
most physical systems are hyperregular. The corresponding Hamiltonian is defined as [90]:

$$H(q, p, t) = p \cdot \dot{q} - L(q, \dot{q}, t). \tag{2.15}$$


It is important to realize that two different Lagrangians can produce the same Hamiltonian.
As shown in Aside 2.5, we can use the preceding definition to find derivatives of the
Hamiltonian with respect to $p$, $q$, and $t$. These derivatives lead to Hamilton's equations
in the form

$$\dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}. \tag{2.16}$$

Hamilton's equations and the Euler-Lagrange equations are equivalent because it is possible
to derive the latter from Hamilton's equations and to show that $\partial L/\partial \dot{q} = p$ and
$\partial L/\partial q = \dot{p}$.


Aside 2.5 Partial Derivatives of the Hamiltonian 

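Definition of the Lagrangian assumes that the variables $q$, $\dot{q}$, and $t$ are independent. Similarly,
the variables $q$, $p$, and $t$ used in the Hamiltonian are assumed to be independent. This
implies that partial derivatives such as $\partial p/\partial q$ are zero. We can use this feature to calculate the
partial derivatives of the Hamiltonian $H(q, p, t)$. Consider $\partial H/\partial q$:

$$\frac{\partial H}{\partial q} = p \cdot \frac{\partial \dot{q}}{\partial q} - \left( \frac{\partial L}{\partial q} + \frac{\partial L}{\partial \dot{q}} \cdot \frac{\partial \dot{q}}{\partial q} \right) = -\frac{\partial L}{\partial q}. \tag{2.17}$$

Application of the Euler-Lagrange equation then yields the simple result

$$\frac{\partial H}{\partial q} = -\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}} \right) = -\dot{p}. \tag{2.18}$$

Similarly,

$$\frac{\partial H}{\partial p} = \dot{q} + p \cdot \frac{\partial \dot{q}}{\partial p} - \frac{\partial L}{\partial \dot{q}} \cdot \frac{\partial \dot{q}}{\partial p} = \dot{q}. \tag{2.19}$$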


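The total time derivative of the Hamiltonian is

$$\frac{dH}{dt} = \frac{\partial H}{\partial q} \cdot \dot{q} + \frac{\partial H}{\partial p} \cdot \dot{p} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t}, \tag{2.20}$$

where we used Eqs. (2.18) and (2.19) to cancel the first two terms.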


The chief advantage of the Hamilton’s equations approach is that it enables us to find 
the Hamiltonian with a knowledge of the kinetic and potential energies (see Aside 2.6). 
As there is no systematic way of guessing the Lagrangian, this feature is quite useful in 
practice. Compared to the Lagrangian formalism, the Hamiltonian formalism also provides
a simpler computational path. The reason is that the Hamiltonian equations (2.16) consti- 
tute a set of 2N first-order, ordinary differential equations for a system of N particles. In 
contrast, the Euler-Lagrange equations (2.8) constitute a set of N second-order, ordinary 
differential equations. We shall see later that Hamiltonian formalism also provides an easy 
path to set up quantum-mechanical equations for a given system. 
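
The first-order structure of Hamilton's equations maps directly onto a numerical integrator. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator with H = p^2/(2m) + k q^2/2 and a leapfrog (Störmer-Verlet) update; the scheme and the function names are choices made here for illustration, not a method prescribed by the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy H = p^2 / (2m) + k q^2 / 2."""
    return p * p / (2 * m) + 0.5 * k * q * q

def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q     # half step in momentum
        q += dt * p / m           # full step in position
        p -= 0.5 * dt * k * q     # half step in momentum
    return q, p

q0, p0 = 1.0, 0.0
dt = 1e-3
steps = int(round(2 * math.pi / dt))      # roughly one oscillation period
q1, p1 = leapfrog(q0, p0, dt, steps)

print(q1, p1)
print(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

Each step only needs the current pair (q, p), which is the practical benefit of working with 2N first-order equations instead of N second-order ones.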


Aside 2.6 Hamiltonian for a System of N Interacting Particles 


We use the definition in Eq. (2.4) to write the Lagrangian in the matrix form

$$L(q, \dot{q}) = \frac{1}{2} \dot{q}^T M(q)\, \dot{q} - U(q), \tag{2.21}$$

where the first term represents the kinetic energy of all moving particles and the second
term is their potential energy. The matrix $M(q)$ contains masses of all particles and is
invertible for each value of $q$ [91].
To find the Hamiltonian, we first find the momenta of all particles using

$$p = \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) = M(q)\, \dot{q}. \tag{2.22}$$

Multiplying this equation with $M^{-1}$, we obtain $\dot{q} = M^{-1} p$, where $M^{-1}$ is the inverse
of matrix $M$. The Hamiltonian is easily found using the definition in Eq. (2.15) and is
given by

$$H(q, p) = \frac{1}{2} p^T (M^{-1})^T p + U(q). \tag{2.23}$$


As expected, the Hamiltonian is the sum of the kinetic and potential energies of all inter- 
acting particles and represents the total energy of the system. Because the Hamiltonian has 
no explicit time dependence, the total energy of the system is conserved on the phase-space 
trajectory for which Hamilton’s action principle is satisfied. 


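The Hamiltonian can be used to express the time evolution of any physical quantity
associated with the system. Consider a generic function $f(q, p, t)$ that depends on both $q$
and $p$ in addition to time $t$. The rate of change of $f(q, p, t)$ is found to be

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \left( \frac{\partial f}{\partial q} \cdot \dot{q} + \frac{\partial f}{\partial p} \cdot \dot{p} \right). \tag{2.24}$$

Substitution of the Hamiltonian equations (2.16) results in

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \left( \frac{\partial f}{\partial q} \cdot \frac{\partial H}{\partial p} - \frac{\partial f}{\partial p} \cdot \frac{\partial H}{\partial q} \right) = \frac{\partial f}{\partial t} + \{f, H\}, \tag{2.25}$$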


where we introduced the Poisson bracket {f, H} that plays a central role in both classical 
and quantum mechanics [73]. The functions f and H may depend on all variables in the set 
$\{q, p\}$ or on a subset of these. A few important Poisson brackets that have found extensive
use in quantum mechanics and can be derived easily are:

$$\{q_i, q_j\} = 0, \quad \{p_i, p_j\} = 0, \quad \{q_i, p_j\} = \delta_{ij}, \quad \{f, q_j\} = -\frac{\partial f}{\partial p_j}, \quad \{f, p_j\} = \frac{\partial f}{\partial q_j}. \tag{2.26}$$


If the function of interest does not depend on time explicitly ($\partial f/\partial t = 0$), the Poisson
bracket provides a compact way to write the equation of motion as

$$\frac{df}{dt} = \{f, H\}. \tag{2.27}$$

An insightful application of this result is to write Hamilton's equations of motion as
$\dot{q} = \{q, H\}$ and $\dot{p} = \{p, H\}$.
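
These brackets are easy to check numerically. The following is a minimal sketch for one degree of freedom, assuming the convention {f, g} = (df/dq)(dg/dp) - (df/dp)(dg/dq) implied by Eq. (2.25) and a harmonic-oscillator Hamiltonian with unit mass and frequency. The function `poisson` and the finite-difference step `h` are illustrative choices.

```python
def poisson(f, g, q, p, h=1e-5):
    """Poisson bracket {f, g} at (q, p), using central finite differences."""
    dfdq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    dfdp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dgdq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dgdp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return dfdq * dgdp - dfdp * dgdq

H = lambda q, p: 0.5 * p * p + 0.5 * q * q   # assumed Hamiltonian
Q = lambda q, p: q                           # coordinate as a phase-space function
P = lambda q, p: p                           # momentum as a phase-space function

q0, p0 = 0.3, 0.7
print(poisson(Q, P, q0, p0))   # canonical bracket {q, p}
print(poisson(Q, H, q0, p0))   # dq/dt = {q, H}, equal to p for this H
print(poisson(P, H, q0, p0))   # dp/dt = {p, H}, equal to -q for this H
print(poisson(H, H, q0, p0))   # {H, H}: energy is conserved
```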

Poisson brackets provide us with a simple way to find the conserved quantities for a 
system. Conserved quantities retain their values as the system evolves with time. As both 
the partial and total time derivatives of f vanish when f is conserved, we find from Eq. 
(2.25) that this is possible only when the Poisson bracket {f, H} itself is zero. Another 
remarkable feature is that Poisson brackets can be used to create new conserved quantities 
by combining two known conserved quantities. This is sometimes referred to as the Poisson
theorem. Suppose $f_1$ and $f_2$ are two conserved quantities in a dynamical system (i.e.,
$\{f_1, H\} = \{f_2, H\} = 0$). If we use the Jacobi identity (see Aside 2.7), we have the relation

$$\{H, \{f_1, f_2\}\} + \{f_1, \{f_2, H\}\} + \{f_2, \{H, f_1\}\} = 0. \tag{2.28}$$


It follows immediately that $\{H, \{f_1, f_2\}\} = 0$ (i.e., the new quantity $\{f_1, f_2\}$ is also a
conserved quantity). However, caution should be exercised because this process does not
always provide new conserved quantities for a finite-dimensional system. It is known that,
for a mechanical system in $N$ dimensions, the total number of conserved quantities is limited
to $2N - 1$. If we know all of these conserved quantities, the operation $\{f_1, f_2\}$ on any
two of them will generate trivial constants, or functions that are simple extensions of the
original functions $f_1$ and $f_2$.
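
A familiar instance of the Poisson theorem is angular momentum in a central potential. The following is a minimal sketch, assuming a three-dimensional isotropic harmonic oscillator with unit mass and frequency (an example chosen here, not taken from the text). The bracket of the conserved components Lx and Ly returns Lz, which is conserved as well.

```python
def pb(f, g, z, h=1e-5):
    """Poisson bracket at the phase-space point z = [x, y, z, px, py, pz]."""
    n = len(z) // 2

    def d(func, i):
        # Central finite difference of func along phase-space coordinate i.
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        return (func(zp) - func(zm)) / (2 * h)

    return sum(d(f, i) * d(g, n + i) - d(f, n + i) * d(g, i) for i in range(n))

Lx = lambda z: z[1] * z[5] - z[2] * z[4]   # y*pz - z*py
Ly = lambda z: z[2] * z[3] - z[0] * z[5]   # z*px - x*pz
Lz = lambda z: z[0] * z[4] - z[1] * z[3]   # x*py - y*px
H = lambda z: 0.5 * sum(c * c for c in z)  # isotropic oscillator (assumed)

point = [0.3, -0.2, 0.5, 0.1, 0.4, -0.6]
print(pb(Lx, H, point), pb(Ly, H, point))  # both conserved
print(pb(Lx, Ly, point), Lz(point))        # bracket reproduces Lz
print(pb(Lz, H, point))                    # and Lz is conserved too
```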


Aside 2.7 Key Properties of Poisson Brackets [92]


The definition of the Poisson bracket can be used to find many properties of such brackets.
A trivial property is that $\{f, g\}$ is equal to zero if the functions $f$ and $g$ do not depend on the
phase-space variables used for defining the Poisson bracket. Other useful properties are:

antisymmetry: $\{f, g\} = -\{g, f\}$

linearity: $\{f + g, h\} = \{f, h\} + \{g, h\}$

partial derivative: $\frac{\partial}{\partial t}\{f, g\} = \left\{ \frac{\partial f}{\partial t}, g \right\} + \left\{ f, \frac{\partial g}{\partial t} \right\}$

Leibniz's relation: $\{fg, h\} = f\{g, h\} + g\{f, h\}$

Jacobi identity: $\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0$.




The total time derivative of the action is related to the Hamiltonian as

$$\frac{dS}{dt} = L = -H(q, p, t) + p \cdot \dot{q}. \tag{2.29}$$

This equation gives us the differential of the action in the form

$$dS = -H(q, p, t)\, dt + p \cdot dq = \frac{\partial S}{\partial t}\, dt + \frac{\partial S}{\partial q} \cdot dq. \tag{2.30}$$

The preceding identity provides us with the relations $H = -\partial S/\partial t$ and $p = \partial S/\partial q$. We can use
them to show that the system dynamics is governed by the following first-order partial
differential equation:

$$\frac{\partial S}{\partial t} + H\left( q, \frac{\partial S}{\partial q}, t \right) = 0. \tag{2.31}$$


This equation is known as the Hamilton-Jacobi equation [93] and is equivalent to the
Hamilton’s equations of motion in Eq. (2.16). For an N-dimensional system, Hamilto- 
nian equations form a set of 2N first-order, ordinary differential equations, whereas the 
Hamilton-Jacobi equation is a single partial-differential equation in N + 1 dimensions. 
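
As a quick consistency check, consider a free particle in one dimension with $H = p^2/2m$, leaving from $q_0$ at time $t_0$. This is a worked sketch under that assumption, and the symbols $q_0$ and $t_0$ are introduced here for illustration:

```latex
S(q,t) = \frac{m\,(q-q_0)^2}{2\,(t-t_0)}, \qquad
\frac{\partial S}{\partial q} = \frac{m\,(q-q_0)}{t-t_0} = p, \qquad
\frac{\partial S}{\partial t} = -\frac{m\,(q-q_0)^2}{2\,(t-t_0)^2},
\\[4pt]
\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^{2}
 = -\frac{m\,(q-q_0)^2}{2\,(t-t_0)^2} + \frac{m\,(q-q_0)^2}{2\,(t-t_0)^2} = 0 .
```

The momentum $p = m(q - q_0)/(t - t_0)$ is the constant momentum of uniform motion, and $-\partial S/\partial t$ equals the kinetic energy, in agreement with $H = -\partial S/\partial t$.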

It should be clear by now that the action S is the central quantity governing a system’s 
motion in the phase space (q, p). The conserved quantities (or invariants) are pivotal to this 
motion because they shape the trajectory as it evolves. It is then legitimate to ask whether 
a direct connection exists between the action and the invariants of a system. The answer 
is affirmative, as an invariant exists that is directly related to the action. It is the action 
on a closed contour in phase space. Using Eq. (2.30), we can write the following contour
integral over a closed contour $\gamma$ in the phase space:

$$\oint_{\gamma} dS = \oint_{\gamma} \left[ p \cdot dq - H(q, p, t)\, dt \right]. \tag{2.32}$$

It can be shown that this integral is invariant for any choice of the contour $\gamma$; it is known as
the Poincaré invariant. It exists for all dynamical systems and is a universal invariant [94]. 
However, the practical value of the Poincaré invariant is limited because its calculation 
requires knowledge of the phase-space trajectory, which can be found only by solving the 
Hamilton equations or the Hamilton-Jacobi equation. 


2.2 Fundamentals of Quantum Mechanics 


The theoretical framework of quantum mechanics is built upon a small number of postu- 
lates based on the concept of a linear, unitary, vector space known as the Hilbert space 
[95, 96, 97]. In this section we use these postulates to introduce the physical states of a 
quantum system. 

2.2.1 Concept of a State Vector


QM Postulate 1 A quantum system is completely described by its state vector, which is a 
unit vector in the Hilbert space. 


However, this postulate does not specify properties of the Hilbert space (e.g., its dimension)
or the state vector that represents the physical system. As we will see later, different
branches of physics (such as QED) have emerged to address this issue [76, 98]. It is common
to denote a state vector in the Hilbert space as $|\Psi\rangle$ in the bra-ket notation of Dirac [99].
We can represent this state in any coordinate system. In a coordinate system based on the
generalized coordinates $q$, the state vector $|\Psi\rangle$ specifies a function, called the wave function
and written as $\Psi(q) = \langle q|\Psi\rangle$. In other words, the wave function is the projection of the
state vector in a specific coordinate basis. For this reason, the terms state vector, quantum
state, and wave function are often used interchangeably. In general, the wave function is
a complex quantity at any point in the phase space and it is a continuous function of both
time and $q$ that is also differentiable.

As seen in Section 1.4, the states $|\Psi\rangle$ and $e^{i\theta}|\Psi\rangle$ represent the same physical system
for any real value of $\theta$. This is why sometimes physical states are identified as rays in the
associated Hilbert space. Also, the superposition principle holds (i.e., any linear combination
of state vectors is also a state vector). This is a consequence of the linear nature
of the associated Hilbert space. The phenomenon of quantum interference discussed in
Section 1.4 is a direct consequence of the superposition principle. However, care must be
taken not to introduce any inadvertent errors. As an example, consider the state vector
$|\Psi\rangle = a_1|\Psi_1\rangle + a_2|\Psi_2\rangle$ that is a linear combination of the state vectors $|\Psi_1\rangle$ and $|\Psi_2\rangle$.
Here, $a_1$ and $a_2$ are two complex numbers. If the phases of these two states change so that
they become $e^{i\theta_1}|\Psi_1\rangle$ and $e^{i\theta_2}|\Psi_2\rangle$, clearly each one of them represents the same physical
system. However, the superposed state, $|\Psi'\rangle = a_1 e^{i\theta_1}|\Psi_1\rangle + a_2 e^{i\theta_2}|\Psi_2\rangle$, is different from
the original state $|\Psi\rangle$. It is important to realize that even though global phase factors on
state vectors can be ignored, relative phases of superposed states must be accounted for
when representing quantum states.
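
The difference between global and relative phases shows up directly in measurement probabilities. The following is a minimal sketch, assuming a two-dimensional Hilbert space with states stored as lists of complex amplitudes in a fixed basis. The particular states and phase values are illustrative choices.

```python
import cmath
import math

def probs(state):
    """Probabilities |c_k|^2 of finding a normalized state in each basis vector."""
    return [abs(c) ** 2 for c in state]

def superpose(a1, s1, a2, s2):
    """Linear combination a1*s1 + a2*s2 of two state vectors."""
    return [a1 * x + a2 * y for x, y in zip(s1, s2)]

s = 1 / math.sqrt(2)
psi1 = [s, s]        # (|0> + |1>) / sqrt(2)
psi2 = [s, -s]       # (|0> - |1>) / sqrt(2)
a1 = a2 = s

original = superpose(a1, psi1, a2, psi2)                  # equals |0>
global_phase = [cmath.exp(0.7j) * c for c in original]    # same physical state
relative = superpose(a1, psi1, a2 * cmath.exp(1j * math.pi / 2), psi2)

print(probs(original))
print(probs(global_phase))
print(probs(relative))
```

A common phase factor leaves every probability unchanged, whereas a phase applied to only one term of the superposition redistributes the probabilities between the basis states.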

The quantum states of two or more parts of a system can be combined to represent a
composite state of the whole system. Suppose $\mathcal{H}_1$ and $\mathcal{H}_2$ are two independent Hilbert
spaces of dimensions $N$ and $M$, respectively. The composite system is denoted as $\mathcal{H}_1 \otimes \mathcal{H}_2$
and is referred to as the tensor product of its two independent parts with the dimension
$N \times M$. Analogously, any state vector in the composite system is represented as $|\Psi_1\rangle \otimes |\Psi_2\rangle$,
where $|\Psi_1\rangle \in \mathcal{H}_1$ and $|\Psi_2\rangle \in \mathcal{H}_2$. The compact notation, $|\Psi_1\rangle |\Psi_2\rangle$, is also used for
$|\Psi_1\rangle \otimes |\Psi_2\rangle$. Aside 2.8 provides more details about the tensor products. A local operator
$\hat{O}_1$ acting on $\mathcal{H}_1$ corresponds to $\hat{O}_1 \otimes I_2$ in the composite system. Similarly, $\hat{O}_2$ acting
on $\mathcal{H}_2$ becomes $I_1 \otimes \hat{O}_2$, where $I_1$ and $I_2$ are identity matrices of dimensions $N$ and $M$,
respectively. Thus, the composite system provides a unified way to carry out operations
specific to each Hilbert space.
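
For finite-dimensional spaces, the tensor product reduces to the Kronecker product of coordinate arrays. The following is a minimal sketch in plain Python, assuming N = 2 and M = 3 and a simple swap operator acting on the first space. The helper names are illustrative.

```python
def kron_vec(u, v):
    """Components of the product state u (x) v, ordered as index i*len(v) + j."""
    return [a * b for a in u for b in v]

def kron_mat(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def identity(n):
    """n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

O1 = [[0, 1], [1, 0]]        # local operator on the first space (N = 2)
psi1 = [1.0, 0.0]            # state in the first space
psi2 = [0.6, 0.0, 0.8]       # state in the second space (M = 3)

lhs = matvec(kron_mat(O1, identity(3)), kron_vec(psi1, psi2))
rhs = kron_vec(matvec(O1, psi1), psi2)
print(lhs)
print(rhs)
```

Both lines print the same six components: applying the extended operator to the composite state is the same as applying the local operator to its own factor and leaving the other factor untouched.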

A nice application of the tensor product naturally emerges when we consider a collection
of quantum particles. If the quantum state of the $j$th particle belongs to the Hilbert space
$\mathcal{H}_j$, the composite space of $N$ particles can be represented by the tensor product
$\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_N$. Here we have used the property that a tensor product does not depend on the
order, and no bracketing of terms is needed. This notation can be further simplified if all
particles have the same Hilbert space. The tensor product is then written as $\mathcal{H}^{\otimes N}$.

The preceding representation breaks down when particles are indistinguishable, and
one has to consider the permutation symmetry. This can be clearly seen by looking at a
two-particle system in which two permutations of the particles' states, $|\Psi_1\rangle \otimes |\Psi_2\rangle$ and
$|\Psi_2\rangle \otimes |\Psi_1\rangle$, describe the same two-particle configuration. Sometimes this is attributed to
the situation that one is not able to track individual particles without disturbing the state of
the system. If a permutation of the particles is irrelevant for the combined system, its state
can only change by a numerical factor such that $|\Psi'\rangle = P|\Psi\rangle$. Since a second permutation
must return the combined system to its original state, we require $P^2 = 1$ or $P = \pm 1$. The
choice of $P = 1$ or $P = -1$ leads to two classes of particles: bosons are particles for
which $P = 1$ and they obey the Bose-Einstein statistics. In contrast, $P = -1$ for fermions
obeying the Fermi-Dirac statistics [100]. One can also say that such quantum states must
follow either Bose-Einstein or Fermi-Dirac statistics. This is commonly referred to as the
Bose-Fermi alternative or symmetrization postulate.

The combined state of two indistinguishable bosons or fermions can be obtained by
symmetrizing or antisymmetrizing the tensor product $|\Psi_1\rangle \otimes |\Psi_2\rangle$ such that

$$|\Psi\rangle_{S,A} = \frac{1}{\sqrt{2}} \Big[ |\Psi_1\rangle \otimes |\Psi_2\rangle \pm |\Psi_2\rangle \otimes |\Psi_1\rangle \Big], \tag{2.33}$$

where the factor of $1/\sqrt{2}$ ensures that the combined state is a unit vector in its Hilbert space
(i.e., it is properly normalized). It is easy to see that $|\Psi\rangle_S$ does not change sign when we
interchange 1 and 2, while $|\Psi\rangle_A$ does change its sign. We can generalize our notation $\mathcal{H}^{\otimes N}$
for a system of $N$ identical particles by adding the subscripts "S" and "A". Thus, $(\mathcal{H}^{\otimes N})_S$
and $(\mathcal{H}^{\otimes N})_A$ denote the $N$-particle states that have been symmetrized or antisymmetrized
appropriately. Other alternatives are theoretically possible and are known as parastatistics
[101].

Even if one is ready to accept the Bose-Fermi alternative, it is not possible to make the
correct choice of the statistics merely based on mathematical consistency. Rather, we note
that this freedom of choice only exists in nonrelativistic quantum mechanics. In relativistic
quantum mechanics, the Bose-Einstein statistics applies to particles with integer spin, and
the Fermi-Dirac statistics applies to particles with half-integer spin. The opposite choice
leads to violations of the axiomatic principles [102].
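
The behavior of the symmetrized and antisymmetrized combinations under particle exchange can be checked directly. The following is a minimal sketch, assuming two particles that each live in a two-dimensional space and single-particle states that are orthonormal (the factor 1/sqrt(2) normalizes the result only in that case). The function names are illustrative.

```python
import math

def kron_vec(u, v):
    """Components of u (x) v, ordered as index i*len(v) + j."""
    return [a * b for a in u for b in v]

def symmetrize(u, v, sign):
    """(u (x) v + sign * v (x) u) / sqrt(2); sign = +1 for bosons, -1 for fermions."""
    uv, vu = kron_vec(u, v), kron_vec(v, u)
    return [(a + sign * b) / math.sqrt(2) for a, b in zip(uv, vu)]

def swap(state, n):
    """Exchange the two particles: component (i, j) is replaced by (j, i)."""
    return [state[j * n + i] for i in range(n) for j in range(n)]

up, down = [1.0, 0.0], [0.0, 1.0]
sym = symmetrize(up, down, +1)
anti = symmetrize(up, down, -1)

print(sym, swap(sym, 2))              # unchanged under exchange
print(anti, swap(anti, 2))            # changes sign under exchange
print(symmetrize(up, up, -1))         # vanishes for two identical states
```

The last line illustrates why two fermions cannot occupy the same single-particle state: the antisymmetrized combination is identically zero.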


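Aside 2.8 Properties of Tensor Products [103, 104]

A tensor product can be used to create a new Hilbert space out of two or more Hilbert
spaces. Let $\mathcal{H}_\Psi$ and $\mathcal{H}_\Phi$ be two Hilbert spaces with the states $|\Psi_n\rangle$ and $|\Phi_m\rangle$. The tensor
product $\mathcal{H}_\Psi \otimes \mathcal{H}_\Phi$ has elements of the form $\sum_{n,m} |\Psi_n\rangle \otimes |\Phi_m\rangle$. The tensor product $|\Psi\rangle \otimes |\Phi\rangle$
obeys the following properties:

• $c\,(|\Psi\rangle \otimes |\Phi\rangle) = (c\,|\Psi\rangle) \otimes |\Phi\rangle = |\Psi\rangle \otimes (c\,|\Phi\rangle)$, $c \in \mathbb{C}$
• $(|\Psi_1\rangle + |\Psi_2\rangle) \otimes |\Phi\rangle = |\Psi_1\rangle \otimes |\Phi\rangle + |\Psi_2\rangle \otimes |\Phi\rangle$
• $|\Psi\rangle \otimes (|\Phi_1\rangle + |\Phi_2\rangle) = |\Psi\rangle \otimes |\Phi_1\rangle + |\Psi\rangle \otimes |\Phi_2\rangle$
• Inner product: $(\langle\Psi_1| \otimes \langle\Phi_1|)(|\Psi_2\rangle \otimes |\Phi_2\rangle) = \langle\Psi_1|\Psi_2\rangle \langle\Phi_1|\Phi_2\rangle$
• Tensor product of operators: $(\hat{O}_\Psi \otimes \hat{O}_\Phi)(|\Psi\rangle \otimes |\Phi\rangle) = \hat{O}_\Psi |\Psi\rangle \otimes \hat{O}_\Phi |\Phi\rangle$.

Using these properties, it is possible to combine orthonormal state vectors in each space
to construct an orthonormal basis for the tensor-product space. In such a construction,
sometimes the notation $|\Psi \otimes \Phi\rangle$ is used in place of $|\Psi\rangle \otimes |\Phi\rangle$ for simplicity.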


It turns out that the partition of a quantum system into two or more subsystems is not 
always possible. This leads to the concept of quantum entanglement (or quantum correlation)
[105, 106]. Suppose we have a quantum state $|\Psi\rangle$ for a quantum system described
by the Hilbert space $\mathcal{H}$ composed of two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ ($\mathcal{H}$ is also referred
to as a bipartite Hilbert space). If $|\Psi\rangle$ cannot be written as $|\Psi\rangle = |\Psi_1\rangle \otimes |\Psi_2\rangle$, then $|\Psi\rangle$
is called entangled with respect to $\mathcal{H}_1$ and $\mathcal{H}_2$. It is natural to think that this concept is
related to classical correlations, and entanglement also occurs for classical particles. However,
there are some fundamental differences. Classical correlations can often be traced
back to some conservation law. For example, a particle can decay into two particles that 
fly apart in opposite directions with speeds that comply with the momentum conservation. 
Thus, knowing the momentum of one particle is sufficient to predict the momentum of the 
other. The momentum of each particle exists regardless of a measurement performed on the 
other, and it can be predicted by applying the conservation of total momentum. However, 
such a technique cannot be used for quantum entangled states, as illustrated in Aside 2.9. 


Aside 2.9 Entanglement through Bell States [107, 108]

Consider two spin-$\frac{1}{2}$ particles. We represent the quantum state of each particle in a
two-dimensional Hilbert space using the traditional “up” and “down” spin orientations, $|\uparrow\rangle_j$ and
$|\downarrow\rangle_j$, where $j = 1$ and 2 for the two particles. The four Bell states for the composite system
of two particles are denoted as

$$|\phi^{\pm}\rangle = \frac{1}{\sqrt{2}} \big( |\uparrow\rangle_1 |\uparrow\rangle_2 \pm |\downarrow\rangle_1 |\downarrow\rangle_2 \big) \quad \text{and} \quad |\psi^{\pm}\rangle = \frac{1}{\sqrt{2}} \big( |\uparrow\rangle_1 |\downarrow\rangle_2 \pm |\downarrow\rangle_1 |\uparrow\rangle_2 \big). \tag{2.34}$$

Clearly none of them can be written as a product of single-particle spin states in the form
$|\xi_1\rangle |\xi_2\rangle$. Both $|\phi^{\pm}\rangle$ and $|\psi^{\pm}\rangle$ are defined as a superposition of the two single-particle
product states with a well-defined phase relation between them. For example, $|\psi^{+}\rangle$ shows
that if the spin of the first particle points up, the spin of the second one points down (and
vice versa). More generally, states of the form $|\psi\rangle = a\, |\uparrow\rangle_1 |\uparrow\rangle_2 + b\, |\downarrow\rangle_1 |\downarrow\rangle_2$ are entangled
for any two nonzero complex numbers $a$ and $b$ such that $|a|^2 + |b|^2 = 1$.

To further clarify the entangled state, consider a pair of spin-$\frac{1}{2}$ particles described by the
Bell state $|\psi^{+}\rangle$ as defined above. If we measure the state of one particle and find it in a specific
state, say $|\uparrow\rangle_1$, we can immediately infer that the second particle, upon measurement,
would always be found in the state $|\downarrow\rangle_2$. This is referred to as quantum entanglement and
it appears similar to classical correlations. The crux lies in the fact that, unlike the classical
case, a quantum state can only provide a probabilistic outcome with no definite answer. A
measurement must be made for a definite answer.
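
For two two-level systems there is a simple test for whether a state factorizes. Writing the state as c00|↑↑⟩ + c01|↑↓⟩ + c10|↓↑⟩ + c11|↓↓⟩, it is a product state exactly when c00*c11 - c01*c10 = 0. The following is a minimal sketch of that test. The basis ordering and the function name `is_product` are assumptions made here for illustration.

```python
import math

def kron_vec(u, v):
    """Components of u (x) v in the order [up-up, up-down, down-up, down-down]."""
    return [a * b for a in u for b in v]

def is_product(state, tol=1e-12):
    """True if a two-qubit state factorizes into single-particle states."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

s = 1 / math.sqrt(2)
phi_plus = [s, 0.0, 0.0, s]             # (|up,up> + |down,down>) / sqrt(2)
psi_plus = [0.0, s, s, 0.0]             # (|up,down> + |down,up>) / sqrt(2)
product = kron_vec([1.0, 0.0], [s, s])  # |up> (x) (|up> + |down>) / sqrt(2)

print(is_product(phi_plus), is_product(psi_plus), is_product(product))
```

The two Bell states fail the test, while the explicitly constructed product state passes it.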
It appears from the preceding discussion that information has been exchanged instantaneously
between the two particles, as soon as the measurement is performed, regardless
of the distance between them. This observation is the basis of the Einstein-Podolsky-Rosen
(EPR) paradox [109]. According to this paradox, exploiting entanglement, an
observer can make measurements on system A to make precise statements about sys- 
tem B that may be located very far away. It tries to answer the question whether a 
quantum-mechanical description of physical reality can be considered complete. We con- 
sider a theory to be complete if every element of physical reality can be mapped uniquely 
to it. That means we should observe exactly what is happening in the physical world, 
and the process of measuring should not distort it (i.e., measurement outcomes must 
be independent of the measurement process). In 1964, Bell investigated the EPR con- 
clusion that the quantum description of physical reality is not complete by using it as 
a working hypothesis and quantified the EPR idea of a deterministic world [108, 110]. 
In a classical world, we would expect that measurement outcomes are independent of 
the measurement process, and the results obtained at one location are independent of 
any actions performed at distances where information cannot be exchanged even at 
the speed of light. Recent experiments show that quantum mechanics does properly 
predict the results of experiments that violate the EPR criteria of reality and locality 
[111, 112]. 


2.2.2 Eigenvalues and Eigenstates 


QM Postulate 2 Every measurable physical quantity is associated with a Hermitian 
operator and is observable only through a measurement of this operator’s eigenvalues. 
When a certain eigenvalue is observed, the act of measurement changes (or collapses) the 
quantum state to the corresponding eigenstate of this operator. 


This postulate assigns a Hermitian operator to every measurement on a quantum system.
Suppose the eigenstates of the operator $\hat{O}$ are given by the set $|\phi_j\rangle$ with the eigenvalues $o_j$
such that $\hat{O}|\phi_j\rangle = o_j|\phi_j\rangle$, where $j$ labels different states. For Hermitian operators
($\hat{O} = \hat{O}^{\dagger}$), the eigenvalues are always real, and $j$ either takes discrete values or falls in a range of
continuous values.

Discrete eigenspectra: When a quantum system is in the state $|\Psi\rangle$ (normalized such
that $\langle\Psi|\Psi\rangle = 1$), the probability $\Pr(o_j)$ of measuring one of the nondegenerate eigenvalues
$o_j$ of the operator is given by

$$\Pr(o_j) = |\langle\phi_j|\Psi\rangle|^2 = |\sigma_j|^2. \tag{2.35}$$

Here $\sigma_j = \langle\phi_j|\Psi\rangle$ is the component of $|\Psi\rangle$ when projected onto the eigenstate vector $|\phi_j\rangle$.
When eigenvalues are distinct and nondegenerate, the corresponding eigenstates form an
orthonormal set. The situation changes if a specific eigenvalue $o_j$ is $k$-fold degenerate. In
this case, as there are $k$ eigenstates giving the same eigenvalue $o_j$, we need to account for
all of them when its probability is calculated. It turns out that one can always find linear
combinations of the $k$ eigenvectors that form an orthonormal set. If these vectors are given
by $|\phi_j^{(i)}\rangle$, where $i = 1, 2, \ldots, k$, the probability of measuring $o_j$ becomes

$$\Pr(o_j) = \sum_{i=1}^{k} |\langle\phi_j^{(i)}|\Psi\rangle|^2. \tag{2.36}$$
As many different orthonormal sets can exist for a given degenerate eigenvalue, one 
may wonder whether different choices would lead to different probabilities. However, it 
is possible to show that the probability is independent of the choice of the orthonormal 
set used to calculate it, as it should be on physical grounds. Thus, if a system is
already in an eigenstate of an operator, then a measurement of that operator yields with certainty the
corresponding eigenvalue. However, there is no prescription as to which eigenvalue of an 
observable operator would be observed with certainty. Instead, one can only specify the 
probability of observing that eigenvalue. 
Continuous eigenspectra: If the eigenvalues $o_c$ of the operator $\hat{O}$ are continuous with
the eigenfunctions $|o_c\rangle$, we define the probability $d\Pr(o_c)$ that the measurement yields a
value between $o_c$ and $o_c + do_c$ as

$$d\Pr(o_c) = |\langle o_c|\Psi\rangle|^2\, do_c. \tag{2.37}$$


It is also possible that an operator has some eigenvalues discrete and some eigenvalues 
continuous. This case can be handled by noting that both the discrete and continuous eigen- 
states can be recast in such a way that they form a basis for representing the operator. Such 
cases will be considered when we look at device models in later chapters. 

QM Postulate 2 forms the basis for measuring a quantum system [113]. Quantum measurements
on any system are performed by a collection of operators: $\hat{O}_m$ with $m = 1$
to $M$. These operators act on the Hilbert space associated with a specific system. As all
measurements constitute a complete set of observable values, the underlying operators satisfy
the completeness relation $\sum_m \hat{O}_m^{\dagger} \hat{O}_m = I$. The situation before a measurement is
called the preparation. If the system is prepared in the state $|\Psi\rangle$, then the probability of
measuring the value for $\hat{O}_m$ is given by $\Pr(m) = \langle\Psi|\hat{O}_m^{\dagger} \hat{O}_m|\Psi\rangle$. Immediately after the measurement,
the quantum state will change to a postmeasurement state $|\Psi_{pm}\rangle = \eta\, \hat{O}_m|\Psi\rangle$,
where $\eta$ is a complex number. The normalization condition, $\langle\Psi_{pm}|\Psi_{pm}\rangle = 1$, requires
$\langle\Psi|\hat{O}_m^{\dagger} \hat{O}_m|\Psi\rangle\, |\eta|^2 = 1$. As we have seen before, global phase factors can be safely
ignored, so $\eta$ can be specified using its magnitude. The prepared state $|\Psi\rangle$ therefore
collapses to a postmeasurement state given by

$$|\Psi_{pm}\rangle = |\eta|\, \hat{O}_m|\Psi\rangle = \frac{\hat{O}_m|\Psi\rangle}{\sqrt{\langle\Psi|\hat{O}_m^{\dagger} \hat{O}_m|\Psi\rangle}}. \tag{2.38}$$
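
The measurement rule can be traced step by step for a two-level system. The following is a minimal sketch, assuming the two measurement operators are the projectors onto the basis states and that the prepared state has amplitudes 0.6 and 0.8i (illustrative values). The probability is computed as the squared norm of the operator applied to the state, which equals the expectation of the operator's adjoint times the operator.

```python
import math

def matvec(M, v):
    """Matrix-vector product with (possibly complex) entries."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def measure(op, psi):
    """Return (probability, normalized postmeasurement state) for operator op."""
    phi = matvec(op, psi)
    prob = sum(abs(c) ** 2 for c in phi)      # squared norm of op applied to psi
    post = [c / math.sqrt(prob) for c in phi] if prob > 0 else None
    return prob, post

P0 = [[1, 0], [0, 0]]     # projector onto the first basis state
P1 = [[0, 0], [0, 1]]     # projector onto the second basis state
psi = [0.6, 0.8j]         # prepared (normalized) state

prob0, post0 = measure(P0, psi)
prob1, post1 = measure(P1, psi)
print(prob0, post0)
print(prob1, post1)
print(prob0 + prob1)      # completeness: probabilities add up to one
```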


Two special types of measurements are widely used [114]: projective measurements
(PM) and positive operator-valued measurements (POVM). The projective measurements
rely on the eigenvalues of the operator to construct projective measurement operators.
Without loss of generality, consider an operator $\hat{O}$ with nondegenerate discrete eigenstates
$|o_m\rangle$, corresponding to the eigenvalues $o_m$. As the set of these eigenstates forms a complete
orthonormal basis for the operator, we can represent the observable operator using its
eigenstates and eigenvalues as $\hat{O} = \sum_m o_m |o_m\rangle \langle o_m|$. The projection operator for the $m$th
eigenvalue is given by $|o_m\rangle \langle o_m|$. This means the observable operator is represented using
projection operators, which form the basis for this measurement method. There are some
nice properties of this representation. For example, we can calculate the average observable
value of the operator $\hat{O}$ with the knowledge that only one of these eigenstates is observable.
Consider the state $|\Psi_{pm}\rangle$. After noting $\hat{O}_m = |o_m\rangle \langle o_m|$, we can immediately find that
the collapsed state is exactly an eigenstate such that $|\Psi_{pm}\rangle = |o_m\rangle$. Therefore the average
observed value of the operator $\langle\hat{O}\rangle$ is the sum $\sum_m o_m \Pr(o_m)$, which can be written as

$$\langle\hat{O}\rangle = \sum_m o_m \langle\Psi|o_m\rangle \langle o_m|\Psi\rangle = \langle\Psi|\hat{O}|\Psi\rangle. \tag{2.39}$$



This way of calculating the average of an operator can be readily extended to any positive 
integer power of the operator by raising the corresponding eigenvalues to the same power. 
Using a Taylor’s series to expand any function of the operator to polynomial powers, this 
recipe can also be used to calculate the average value of any functional of the operator. 
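The projective-measurement recipe can be checked numerically. The following is a minimal sketch, assuming an arbitrary 2x2 Hermitian observable and a prepared state chosen only for illustration (neither comes from the text); it builds the projectors from the eigenstates, evaluates the outcome probabilities, and compares the two sides of Eq. (2.39).

```python
import numpy as np

# Hypothetical observable and prepared state, chosen only for illustration.
O = np.array([[1.0, 1.0], [1.0, -1.0]])
psi = np.array([1.0, 0.0], dtype=complex)  # normalized prepared state

# Spectral decomposition O = sum_m o_m |o_m><o_m|
eigvals, eigvecs = np.linalg.eigh(O)

# Projectors |o_m><o_m| and outcome probabilities Pr(o_m) = <psi|P_m|psi>
projectors = [np.outer(eigvecs[:, m], eigvecs[:, m].conj())
              for m in range(len(eigvals))]
probs = np.array([np.real(psi.conj() @ P @ psi) for P in projectors])

# Average value computed two ways, following Eq. (2.39)
avg_from_probs = float(np.sum(eigvals * probs))
avg_direct = float(np.real(psi.conj() @ O @ psi))


def collapse(m):
    """Postmeasurement state for outcome m, normalized as in Eq. (2.38)."""
    unnormalized = projectors[m] @ psi
    return unnormalized / np.sqrt(probs[m])
```

Calling `collapse(m)` a second time on its own output would return the same state, which is the repeatability property of projective measurements.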

The preceding procedure for the projection operators can be extended to describe the
POVM scheme as well. Here, rather than specific projective operators, we consider a finite
set of observable operators $\hat{S}_m$ defined using $\langle\Psi|\hat{S}_m|\Psi\rangle \ge 0$.
If these operators also form a complete set, $\sum_m \hat{S}_m = I$. The condition,
$\langle\Psi|\hat{S}_m|\Psi\rangle \ge 0$, for each operator ensures that their eigenvalues
are always real and nonnegative. Therefore, it is possible to define the operators
$\hat{M}_m = \sqrt{\hat{S}_m}$, which satisfy all the conditions of the original set of
operators $\hat{S}_m$. Owing to the completeness property of this set, we obtain the
analogous relation: $\sum_m \hat{M}_m^\dagger \hat{M}_m = I$. As this set of newly defined
operators has properties similar to those of projective operators, the same measurement
procedure can be adopted to this case (see Aside 2.10).


Aside 2.10 Quantum Measurement of a Qubit [115]


Classical information bits are represented by logic gates and take values 0 or 1 for their off
and on states. We can designate these two possibilities as the $|0\rangle$ and $|1\rangle$
states, respectively. A quantum bit (or a qubit) is a generalization of this concept. The
state of a qubit is a superposition state, $\alpha|0\rangle + \beta|1\rangle$, where $\alpha$
and $\beta$ are two complex numbers such that $|\alpha|^2 + |\beta|^2 = 1$. They are called
the qubit amplitudes, and the associated orthonormal states $|0\rangle$ and $|1\rangle$ are
called the computational basis.


PM: Consider two measurements governed by the operators $\hat{O}_0 = |0\rangle\langle 0|$ and
$\hat{O}_1 = |1\rangle\langle 1|$. They form a complete set because
$\hat{O}_0^\dagger\hat{O}_0 + \hat{O}_1^\dagger\hat{O}_1 = I$. Suppose the qubit is currently
in the state $|\Psi\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Then the probability
of getting the state $|0\rangle$ is given by
$\Pr(0) = \langle\Psi|\hat{O}_0^\dagger\hat{O}_0|\Psi\rangle = \frac{1}{2}$. After the
measurement, the prepared state, $|\Psi\rangle$, will collapse to the state $|0\rangle$. Once
the qubit is in this state, if we perform the measurement again, the qubit remains in the
same state. This property of projective measurements is known as repeatability. Essentially,
repeated projective measurements do not change the state. The same conclusion holds for the
state $|1\rangle$ as well.

POVM: Consider a POVM containing three measurement operators:

$$\hat{M}_1 = \frac{\sqrt{2}}{1+\sqrt{2}}\,|1\rangle\langle 1|, \qquad \hat{M}_2 = \frac{\sqrt{2}}{2+2\sqrt{2}}\,(|0\rangle - |1\rangle)(\langle 0| - \langle 1|), \qquad \hat{M}_3 = I - \hat{M}_1 - \hat{M}_2. \tag{2.40}$$

These three operators are positive and, by construction, satisfy the completeness relation
$\sum_{m=1}^{3} \hat{M}_m = I$ (they play the role of the operators $\hat{S}_m$ introduced
earlier). Thus, they are eligible as POVM operators.


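Assume that a qubit is prepared such that either $|\Psi\rangle = |0\rangle$ or
$|\Psi\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Because these two states are not
orthogonal, no projective measurement can distinguish them with certainty. We want to
determine the exact state of the qubit by carrying out a measurement. If we calculate all
the probabilities, we find that $\langle 0|\hat{M}_1|0\rangle = 0$ and that $\hat{M}_2$ has a
vanishing expectation value in the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Hence
the first outcome can occur only for the superposition state and the second only for
$|0\rangle$, giving us the definitive information that determines the state of the system.
The operator $\hat{M}_3$ fails to provide this information, as it gives nonzero probabilities
for both preparations.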
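The properties of the three POVM operators can be checked numerically. The sketch below assumes the two candidate preparations are $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$, a common textbook pair of nonorthogonal states; the helper `outcome_probs` is a name introduced here only for illustration.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)    # assumed second candidate state
minus = (ket0 - ket1) / np.sqrt(2)

# POVM operators of Eq. (2.40); note sqrt(2)/(2+2*sqrt(2)) (|0>-|1>)(<0|-<1|) = c |-><-|
c = np.sqrt(2) / (1 + np.sqrt(2))
M1 = c * np.outer(ket1, ket1)
M2 = c * np.outer(minus, minus)
M3 = np.eye(2) - M1 - M2
povm = [M1, M2, M3]


def outcome_probs(state):
    """Probabilities <state|M_m|state> for the three outcomes (real state vectors)."""
    return np.array([state @ M @ state for M in povm])


probs_ket0 = outcome_probs(ket0)   # outcome 1 never occurs for |0>
probs_plus = outcome_probs(plus)   # outcome 2 never occurs for |+>
```

Outcome 3 has a nonzero probability for both preparations, so it is the inconclusive result; outcomes 1 and 2, when they occur, identify the state without error.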


Aside 2.11 Uncertainty principle [46, 96] 

Uncertainty and errors are intrinsic to physical measurements. In his book Philosophiæ
Naturalis Principia Mathematica [116], even Isaac Newton found it necessary to take
into account measurement errors. In the context of modern physics, Richard Feynman
considered in his "Lectures on Physics" the implications of the uncertainty principle
for electrons through Young's double-slit experiment by demonstrating the impossibility
of determining the path of an electron without disturbing the interference pattern.


A popular version of the uncertainty principle states: one cannot simultaneously know both 
the position and momentum of a moving particle. This is often interpreted as implying that 
it is impossible to simultaneously measure the position and momentum of a particle with 
an arbitrarily high precision. This statement does not mean that a particle cannot possess 
a definite position as well as a definite momentum, but only that the uncertainty princi- 
ple prevents their simultaneous measurements, even if we work with a perfect measuring 
apparatus. Therefore, one must consider the standard deviation arising from an ensemble 
of similar measurements. 


The most generic version of the uncertainty principle is based on properties of observable
operators. Consider two observable operators $A$ and $B$. The Cauchy-Schwarz inequality
[117] implies that

$$|\langle\Psi|AB|\Psi\rangle|^2 \le \langle\Psi|A^2|\Psi\rangle\,\langle\Psi|B^2|\Psi\rangle, \tag{2.41}$$

where $|\Psi\rangle$ is a quantum state of the system under consideration. If we define the
commutator for these operators as $[A, B] = AB - BA$, and the anticommutator as
$\{A, B\} = AB + BA$, we can write the identity

$$|\langle\Psi|AB|\Psi\rangle|^2 = \frac{1}{4}\,|\langle\Psi|[A, B]|\Psi\rangle|^2 + \frac{1}{4}\,|\langle\Psi|\{A, B\}|\Psi\rangle|^2. \tag{2.42}$$

If we combine Eqs. (2.41) and (2.42), we obtain

$$\langle\Psi|A^2|\Psi\rangle\,\langle\Psi|B^2|\Psi\rangle \ge \frac{1}{4}\,|\langle\Psi|[A, B]|\Psi\rangle|^2. \tag{2.43}$$

This is the uncertainty principle in its most general form. It can be recast as a condition on
the standard deviations of the measurements on the operators $A$ and $B$ by replacing them
with the operators $A - \langle A\rangle$ and $B - \langle B\rangle$ in Eq. (2.43).
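The chain of relations (2.41) to (2.43) can be tested on a small example. The sketch below assumes the Pauli matrices $\sigma_x$ and $\sigma_y$ as the two observables and random qubit states; both are illustrative choices, not taken from the text.

```python
import numpy as np

# Pauli matrices used as two Hermitian observables (illustrative choice).
A = np.array([[0, 1], [1, 0]], dtype=complex)      # sigma_x
B = np.array([[0, -1j], [1j, 0]], dtype=complex)   # sigma_y


def expval(op, psi):
    """Expectation value <psi|op|psi> for a normalized state vector."""
    return psi.conj() @ op @ psi


def check_bounds(psi):
    """Return the quantities entering Eqs. (2.41)-(2.43) for a given state."""
    psi = psi / np.linalg.norm(psi)
    comm = A @ B - B @ A
    anti = A @ B + B @ A
    lhs = abs(expval(A @ B, psi)) ** 2                               # |<AB>|^2
    product = np.real(expval(A @ A, psi)) * np.real(expval(B @ B, psi))
    identity_rhs = 0.25 * abs(expval(comm, psi)) ** 2 + 0.25 * abs(expval(anti, psi)) ** 2
    bound = 0.25 * abs(expval(comm, psi)) ** 2
    return lhs, product, identity_rhs, bound


rng = np.random.default_rng(0)
results = [check_bounds(rng.normal(size=2) + 1j * rng.normal(size=2))
           for _ in range(100)]
```

For every sampled state, `lhs` equals `identity_rhs` (Eq. 2.42), `lhs` does not exceed `product` (Eq. 2.41), and `product` is never below `bound` (Eq. 2.43).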


2.2.3 Density Operator 


An alternative description of quantum states makes use of the density operator, or its rep- 
resentation in the form of a density matrix [118]. The utility of this operator comes from 
its property that it can handle the pure states, as well as mixed states for which the state of 
a system is known only in a probabilistic sense among several possible states. 

For a pure state $|\Psi\rangle$ with $\langle\Psi|\Psi\rangle = 1$, the density operator
$\rho$ is defined as

$$\rho = |\Psi\rangle\langle\Psi| \quad \text{(for a pure state)}. \tag{2.44}$$

One property of this density operator is that $\mathrm{Tr}(\rho) = 1$ when the pure state is
normalized properly. In an orthonormal basis based on the states $\{|n\rangle\}$, the density
operator becomes a density matrix with the elements $\rho_{mn} = \langle m|\rho|n\rangle$.
The trace operation $\mathrm{Tr}(\rho)$ is then equal to $\sum_n \rho_{nn}$, or the sum of
all diagonal elements. It is easy to show that the density matrix of a pure state is
Hermitian ($\rho^\dagger = \rho$), idempotent ($\rho^2 = \rho$), and positive semidefinite
(i.e., $\langle\phi|\rho|\phi\rangle \ge 0$ for any state $|\phi\rangle$).

The preceding definition of the density operator can be extended to a common
scenario where a quantum system could be in a number of pure states, but we only know
the probability of finding the system in each state. If the system can be found in the pure
state $|\Psi_m\rangle$ with the probability $p_m$, where $m$ identifies one among all possible
quantum states, the density operator of the system is defined as

$$\rho = \sum_m p_m |\Psi_m\rangle\langle\Psi_m| \quad \text{(for a mixed state)}. \tag{2.45}$$

A good example is provided by the Stern-Gerlach device [119], where we do not know
the exact state of each spin-$\frac{1}{2}$ particle, but the physics of the problem tells us
that the spin may point either "up" or "down" with equal probability. The density operator of
a mixed state is also Hermitian, positive semidefinite, and satisfies $\mathrm{Tr}(\rho) = 1$.
However, it fails to be idempotent because $\rho^2 \ne \rho$ and $\mathrm{Tr}(\rho^2) < 1$.
Aside 2.12 provides examples of the density operators in the pure and mixed states.


Aside 2.12 Density Operators of Pure and Mixed States [118] 


Consider a two-level quantum system with pure states $|0\rangle$ and $|1\rangle$. Consider a
new pure state, $|\Psi\rangle$, that is a superposition of these two states (i.e.,
$|\Psi\rangle = a|0\rangle + b|1\rangle$). The normalization condition,
$\langle\Psi|\Psi\rangle = 1$, demands that the complex numbers $a$ and $b$ satisfy the
relation $|a|^2 + |b|^2 = 1$. The density operator for this pure state case is given by

$$\rho = |\Psi\rangle\langle\Psi| = |a|^2\,|0\rangle\langle 0| + ab^*\,|0\rangle\langle 1| + a^*b\,|1\rangle\langle 0| + |b|^2\,|1\rangle\langle 1|. \tag{2.46}$$

It is common to introduce a column vector $V$ containing two elements $|0\rangle$ and
$|1\rangle$, and write the preceding equation as $\rho = V\tilde{\rho}V^\dagger$, where the
density matrix $\tilde{\rho}$, with elements $\tilde{\rho}_{mn} = \langle m|\rho|n\rangle$,
is given by

$$\tilde{\rho} = \begin{pmatrix} |a|^2 & ab^* \\ a^*b & |b|^2 \end{pmatrix}. \tag{2.47}$$

However, if we create a mixed state using the pure states $|0\rangle$ and $|1\rangle$, its
density matrix takes the form

$$\rho = |a|^2\,|0\rangle\langle 0| + |b|^2\,|1\rangle\langle 1| \;\Rightarrow\; \tilde{\rho} = \begin{pmatrix} |a|^2 & 0 \\ 0 & |b|^2 \end{pmatrix}. \tag{2.48}$$

The interference terms, or the off-diagonal terms, are absent in this density matrix.


As we have seen in Aside 2.12, the density matrix of a quantum system clearly conveys
the difference between the pure and mixed states. In a mixed state, probabilities are
added very much like in a classical system, while in a pure state probability amplitudes
are superimposed, resulting in the nondiagonal elements of a density matrix. The transition
from a pure state to a mixed state can be observed through the loss of nondiagonal
elements in a density matrix. Also, it is important to realize that a mixed state is different
from a pure-state superposition of the form $|\Psi\rangle = \sum_m \sqrt{p_m}\,|\Psi_m\rangle$,
because all states making up this superposition are simultaneously present. The
probabilities appearing in the coefficients act here as deterministic amplitudes of the
superposition rather than as classical weights. The density matrix for this state has the
form

$$\rho = |\Psi\rangle\langle\Psi| = \sum_m p_m |\Psi_m\rangle\langle\Psi_m| + \sum_{m \ne n} \sqrt{p_m p_n}\,|\Psi_m\rangle\langle\Psi_n|. \tag{2.49}$$

The presence of interference terms in the preceding equation indicates that this is a pure
state (and not a mixed state).
It is possible to quantify the purity of a quantum state by introducing the concept of von
Neumann entropy $S(\rho)$, defined as [120]

$$S(\rho) = -\mathrm{Tr}(\rho \log_2 \rho) = -\sum_m \lambda_m \log_2 \lambda_m, \tag{2.50}$$

where $\lambda_m$ are the eigenvalues of $\rho$. If an eigenvalue vanishes, we drop that
eigenvalue from the sum, just as is done in classical information theory. The von Neumann
entropy is zero for pure states because the associated projection operator has
$\lambda_1 = 1$ as the only nonzero eigenvalue. However, for a maximally mixed state that
corresponds to complete ignorance about the $N$ mutually exclusive pure states,
$\lambda_m = 1/N$ in Eq. (2.50), resulting in $S(\rho) = \log_2 N$. This is the maximum value
that $S(\rho)$ can take. Any other scenario would have a value for $S(\rho)$ between 0 and
$\log_2 N$. The von Neumann entropy is essentially a quantum analog of the well-known
Shannon entropy used in communication theory.

Another method sometimes used for quantifying the purity of a state is to calculate the
purity function $P(\rho)$, defined as [121]

$$P(\rho) = \mathrm{Tr}(\rho^2) = \sum_m \lambda_m^2, \tag{2.51}$$

where $\lambda_m$ are the eigenvalues of $\rho$. This function has a value between 0 and 1
for any density operator. It is exactly equal to 1 for a pure state and provides a convenient
way to detect the purity of a state without much calculation.
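Both purity measures are easy to evaluate from the eigenvalues of a density matrix. The following minimal sketch assumes a qubit with amplitudes $a = \sqrt{0.3}$ and $b = \sqrt{0.7}$ (illustrative values) and compares a pure superposition, the corresponding mixed state, and the maximally mixed state; the function names are introduced here for illustration only.

```python
import numpy as np


def von_neumann_entropy(rho):
    """S(rho) = -sum_m lambda_m log2(lambda_m), dropping vanishing eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))


def purity(rho):
    """P(rho) = Tr(rho^2)."""
    return float(np.real(np.trace(rho @ rho)))


a, b = np.sqrt(0.3), np.sqrt(0.7)           # illustrative amplitudes
psi = np.array([a, b])
rho_pure = np.outer(psi, psi.conj())         # keeps the off-diagonal terms
rho_mixed = np.diag([a ** 2, b ** 2])        # off-diagonal terms removed
rho_max = np.eye(2) / 2                      # maximally mixed qubit (N = 2)
```

With these inputs the pure state has zero entropy and unit purity, while the maximally mixed qubit reaches $S = \log_2 2 = 1$ and $P = 1/2$.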

The density operator provides an easy way to calculate the average value of an operator,
regardless of the pure or mixed characters of the states involved. Unlike the pure-state case
where one can use the operation, $\langle\hat{O}\rangle = \langle\Psi|\hat{O}|\Psi\rangle$, to
find the average of an operator $\hat{O}$, no single $|\Psi\rangle$ exists for a mixed state.
However, it is possible to generalize the averaging concept by adopting ideas from classical
probability theory. The modified procedure consists of simply weighting the expectation
values $\langle\Psi_m|\hat{O}|\Psi_m\rangle$ for each of the pure states $|\Psi_m\rangle$
contained in the mixed state by their respective classical probabilities $p_m$ and summing
the results over the entire ensemble to obtain

$$\langle\hat{O}\rangle = \sum_m p_m \langle\Psi_m|\hat{O}|\Psi_m\rangle = \mathrm{Tr}(\rho\hat{O}). \tag{2.52}$$

The result $\langle\hat{O}\rangle = \mathrm{Tr}(\rho\hat{O})$ also holds for pure states.
Thus, the density operator provides us with a unified way to calculate the average value of
an operator.
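The equality in Eq. (2.52) can be confirmed for a small ensemble. This sketch assumes a two-member ensemble of qubit states with probabilities 0.25 and 0.75 and the observable $\sigma_z$; all of these are illustrative choices.

```python
import numpy as np

sigma_z = np.diag([1.0, -1.0])               # observable (illustrative)
up = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
states = [up, plus]                          # pure states in the ensemble
probs = [0.25, 0.75]                         # their classical probabilities

# Mixed-state density operator, Eq. (2.45)
rho = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))

# Ensemble-weighted average and trace formula, Eq. (2.52)
avg_ensemble = sum(p * (s.conj() @ sigma_z @ s) for p, s in zip(probs, states))
avg_trace = np.trace(rho @ sigma_z)
```

Note that the ensemble states need not be orthogonal for the trace formula to hold.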

The utility of the density operator becomes apparent especially when one needs the
description of individual subsystems of a composite quantum device. When the subsystems
"A" and "B" are uncorrelated, the density operator of the composite device "AB" is provided
by the tensor product $\rho_{AB} = \rho_A \otimes \rho_B$. We can recover the individual
components, $\rho_A$ and $\rho_B$, from $\rho_{AB}$ by using
$\rho_A = \mathrm{Tr}_B(\rho_{AB})$, where $\mathrm{Tr}_B(\cdots)$ is a partial trace operator
over the subsystem B. When taking the partial trace, the contribution of B drops out given
that $\mathrm{Tr}(\rho) = 1$ is true for any density operator. This procedure is called tracing
out over a subsystem. An interesting application of this "tracing out" operation is
purification, in which a mixed state is represented as the partial trace of a pure state of
a larger system. Suppose $\rho_A$ is a density operator in the Hilbert space $H_A$ with $N$
dimensions. Then, it is possible to find a Hilbert space $H_B$ and a pure state
$|\Psi\rangle \in H_A \otimes H_B$ such that the partial trace over $|\Psi\rangle$ results in
$\rho_A$ (i.e., $\mathrm{Tr}_B(|\Psi\rangle\langle\Psi|) = \rho_A$).
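Tracing out and purification can both be written in a few lines. The sketch below is one possible construction, assuming that $H_B$ has the same dimension as $H_A$ and that the purifying state is built from the eigen-decomposition of $\rho_A$; the function names and the sample matrices are hypothetical.

```python
import numpy as np


def partial_trace_B(rho_AB, dim_A, dim_B):
    """Trace out subsystem B from a density matrix on H_A (x) H_B."""
    reshaped = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum('ibjb->ij', reshaped)


def purify(rho_A):
    """Pure state |Psi> = sum_m sqrt(lambda_m) |v_m>|m> with Tr_B|Psi><Psi| = rho_A."""
    lam, vecs = np.linalg.eigh(rho_A)
    lam = np.clip(lam, 0.0, None)            # guard against tiny negative round-off
    dim = len(lam)
    psi = np.zeros(dim * dim, dtype=complex)
    for m in range(dim):
        basis_B = np.zeros(dim)
        basis_B[m] = 1.0
        psi += np.sqrt(lam[m]) * np.kron(vecs[:, m], basis_B)
    return psi


rho_A = np.array([[0.3, 0.0], [0.0, 0.7]], dtype=complex)   # sample mixed state
rho_B = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)   # sample state of B
recovered_A = partial_trace_B(np.kron(rho_A, rho_B), 2, 2)

psi = purify(rho_A)
traced_pure = partial_trace_B(np.outer(psi, psi.conj()), 2, 2)
```

Both `recovered_A` and `traced_pure` reproduce `rho_A`, the first from a product state and the second from an entangled pure state of the enlarged system.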

In the case of multiple subsystems, if each one of them has the density matrix $\rho_n$
($n = 1$ to $N$), we can write the density matrix of the composite system as
$\rho_C = \rho_1 \otimes \rho_2 \otimes \cdots \otimes \rho_N$. If all subsystems are identical
but independent from each other, we can write the composite density matrix as
$\rho_C = \rho^{\otimes N}$.


2.2.4 Canonical Quantization 


We ask how quantum mechanics can be used for a physical device requiring quantum 
description. This seems a difficult task because quantum-mechanics results are not always 
intuitive or close to our day-to-day experience. The recipe comes from a procedure called 
canonical quantization, which provides a mathematical mapping from a classical system 
to the corresponding quantum system [122, 123]. However, “quantization” is not a well- 
posed problem, and there is a plethora of techniques developed over time. Here, we sketch 
one of the simplest recipes. 

The process begins by defining a Hilbert space $H$ associated with the system. Quantum
states of the system are represented by state vectors, and observables by Hermitian
operators, in this Hilbert space. The energy of the system is then expressed as a
Hamiltonian, which is found using classical arguments with the aid of generalized coordinates
and momenta ($q$ and $p$). The next step maps all $q$ and $p$ to operators as (a hat over
these variables denotes their operator character)

$$q \to \hat{q}, \qquad p \to \hat{p} = -i\hbar\frac{\partial}{\partial q}. \tag{2.53}$$

We also convert classical Poisson brackets to operator commutators (Lie brackets of quantum
observables) as follows (more accurate versions of these relations are called Weyl
relations) [122, 124]:

$$[\hat{q}_m, \hat{p}_n] = i\hbar\delta_{mn}, \qquad [\hat{q}_m, \hat{q}_n] = 0, \qquad [\hat{p}_m, \hat{p}_n] = 0. \tag{2.54}$$

There are many subtle ambiguities in this conversion process. For example, classically
identical terms such as $p_m q_m$ and $q_m p_m$ are not the same in their operator versions,
and neither of the two operator products is Hermitian. One way of overcoming this problem is
by replacing each one of them with the symmetric form
$\frac{1}{2}(\hat{p}_m\hat{q}_m + \hat{q}_m\hat{p}_m)$. Many other ambiguous steps need to be
tackled using arguments based on the axioms of quantum mechanics. The most crucial and
nontrivial step is the introduction of the numerically small but nonzero parameter
$\hbar = h/(2\pi)$, where the Planck constant $h$ is a hallmark of all quantum systems [125].

Once the Hamiltonian has been constructed using the preceding recipe, it becomes
possible to invoke the machinery of quantum mechanics for fully describing a
quantum device. Interestingly, the results obtained using classical mechanics nearly
hold in most cases, except for small $\hbar$-dependent corrections. This is the basis
for the correspondence principle, which states that quantum mechanics approaches
the classical description if $\hbar$-dependent contributions are negligibly small. Aside
2.13 shows an application of this principle using a classical pendulum as an
example.

One may ask why the Hamiltonian is preferred over a Lagrangian in quantum mechanics.
In the Lagrangian formalism, the operators $\hat{q}$ and $\dot{\hat{q}}$ do not commute and
the Hamiltonian defined by the Legendre transformation is not uniquely determined. Moreover,
it is considerably harder to set up equations of motion using $\partial L/\partial q$ and
$\partial L/\partial\dot{q}$ when $q$ and $\dot{q}$ are operators. These difficulties are
remedied by adopting the Hamiltonian, which provides us a simple way to set up the equation
of motion for any function $x$ of $p$ and $q$ variables as $i\hbar\dot{x} = [x, H]$. In this
formalism, conserved quantities are found by setting $\dot{x} = 0$, or by using
$[x, H] = 0$.


Aside 2.13 Canonical Quantization Example 

Consider a pendulum made of mass $m$, hung from a rigid rod of length $l$. The pendulum
traces a circle as it rotates in the vertical plane. If we use polar coordinates, a single
generalized coordinate $\theta$, representing the angle from the vertical, can describe the
pendulum motion. The first step is to calculate the kinetic and potential energies
classically when the pendulum is located at an angle $\theta$. Noting that the position
$x = l\theta$, the kinetic energy is found to be $T = \frac{1}{2}ml^2\dot{\theta}^2$. The
potential energy at this position is $V = mgl(1 - \cos\theta)$. The Lagrangian for this
pendulum is given by

$$L(\theta, \dot{\theta}) = T - V = \frac{1}{2}ml^2\dot{\theta}^2 - mgl(1 - \cos\theta). \tag{2.55}$$

The canonical momentum associated with the angle $\theta$ is given by
$p_\theta = \partial L/\partial\dot{\theta} = ml^2\dot{\theta}$. Using this result, the
Hamiltonian $H$ can be written as

$$H(p_\theta, \theta) = p_\theta\dot{\theta} - L(\theta, \dot{\theta}) = \frac{p_\theta^2}{2ml^2} + mgl(1 - \cos\theta). \tag{2.56}$$


For a quantum description of this pendulum we map $\theta$ and $p_\theta$ to corresponding
operators, $\hat{\theta}$ and $\hat{p}_\theta$, and impose the commutation relation
$[\hat{\theta}, \hat{p}_\theta] = i\hbar$. If we assume that the angle $\theta$ remains
relatively small in practice and use the series expansion for $\cos\theta$, the Hamiltonian
can be written as $\hat{H} = \hat{H}_0 + \delta\hat{H}$, where

$$\hat{H}_0 = \frac{\hat{p}_\theta^2}{2ml^2} + \frac{1}{2}mgl\,\hat{\theta}^2, \tag{2.57}$$

and $\delta\hat{H}$ contains the higher-order terms. By dividing the Hamiltonian into a
harmonic part $\hat{H}_0$ and a nonharmonic part $\delta\hat{H}$, we can use the known
techniques to treat each part separately. In particular, the nonharmonic part can be treated
using the perturbation techniques discussed later.
Inspired by the quantization of harmonic oscillators, we introduce the creation operator
$\hat{a}^\dagger$ and the annihilation operator $\hat{a}$ using the relations

$$\hat{\theta} = \sqrt{\frac{\hbar}{2b}}\,(\hat{a}^\dagger + \hat{a}) \quad \text{and} \quad \hat{p}_\theta = i\sqrt{\frac{\hbar b}{2}}\,(\hat{a}^\dagger - \hat{a}). \tag{2.58}$$

It is easy to verify that $[\hat{a}, \hat{a}^\dagger] = 1$ provides the correct commutator
relation $[\hat{\theta}, \hat{p}_\theta] = i\hbar$. The parameter $b$ is found using
$\hat{H}_0$. Substitution of the preceding variables in $\hat{H}_0$ shows that, if we choose
$b = ml\sqrt{gl}$, $\hat{H}_0$ can be written in the form

$$\hat{H}_0 = \hbar\omega\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right), \tag{2.59}$$

where we used the classical expression $\omega = \sqrt{g/l}$ for the angular frequency of a
harmonic oscillator. It is common to introduce $\hat{n} = \hat{a}^\dagger\hat{a}$ as the
number operator and write $\hat{H}_0$ as $\hat{H}_0 = \hbar\omega(\hat{n} + \frac{1}{2})$,
enabling us to use the full machinery of quantum mechanics.
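The ladder-operator form of the harmonic part can be checked with finite matrices. The sketch below assumes units with $\hbar = 1$, hypothetical pendulum parameters, a truncated Fock space of dimension `N`, and the standard parameterization $\hat{\theta} = \sqrt{\hbar/2b}\,(\hat{a}^\dagger + \hat{a})$, $\hat{p}_\theta = i\sqrt{\hbar b/2}\,(\hat{a}^\dagger - \hat{a})$ with $b = ml\sqrt{gl}$. Because of the truncation, the identities hold exactly only away from the last basis state.

```python
import numpy as np

hbar = 1.0                        # illustrative units
m, g, l = 1.0, 9.81, 0.5          # hypothetical pendulum parameters
omega = np.sqrt(g / l)            # classical small-angle frequency
b = m * l * np.sqrt(g * l)        # choice that removes the a^2 and (a^dagger)^2 terms

N = 40                                            # truncated Fock-space dimension
a = np.diag(np.sqrt(np.arange(1, N)), k=1)        # annihilation operator
adag = a.conj().T                                 # creation operator

theta = np.sqrt(hbar / (2 * b)) * (adag + a)
p_theta = 1j * np.sqrt(hbar * b / 2) * (adag - a)

# Harmonic Hamiltonian in coordinate/momentum form and in ladder form
H0 = p_theta @ p_theta / (2 * m * l ** 2) + 0.5 * m * g * l * (theta @ theta)
H0_ladder = hbar * omega * (adag @ a + 0.5 * np.eye(N))

# Canonical commutator [theta, p_theta]
comm = theta @ p_theta - p_theta @ theta
```

Comparing the upper-left $(N-1)\times(N-1)$ blocks of `H0` and `H0_ladder` avoids the truncation artifact in the highest Fock state.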


2.2.5 Time Evolution of a State Vector 


QM Postulate 3 The evolution of the state vector of a closed quantum system from its
initial value $|\Psi(t_0)\rangle$ at time $t = t_0$ is governed by a unitary transformation
$\hat{U}(t, t_0)$ such that the state vector at time $t$ is
$|\Psi(t)\rangle = \hat{U}(t, t_0)|\Psi(t_0)\rangle$.


The unitary operator $\hat{U}(t, t_0)$ governs the entire dynamics of a quantum system,
subject to the restriction that the system is closed (i.e., it is not interacting with any
other system). No quantum system is really closed if it is interacting with its surroundings.
However, it is possible to isolate a system to the extent that, to a very good approximation,
it can be viewed as a closed system. It is also possible to consider it a part of a larger
closed system that is undergoing unitary evolution.

There are several representations (sometimes called pictures) of the unitary operators 
in quantum mechanics [46, 95, 96]. It is possible to establish connections among them by 
exploiting properties of unitary transformations. Three commonly used representations are 
known as the Schrödinger picture, the Heisenberg picture, and the interaction picture. The
Schrödinger picture is mainly used when the Hamiltonian describing a quantum system 
does not depend on time. The other two pictures have features that make them attrac- 
tive for time-dependent Hamiltonians. The Heisenberg picture is more closely related to 
classical mechanics, where the dynamical variables (which become operators in quantum 
mechanics) evolve in time. The interaction picture is the preferred choice when quantum 
particles are interacting with an electromagnetic field, and the Hamiltonian is explicitly 
time dependent. 


The Schrödinger Picture


The Schrödinger picture is preferred when dealing with a Hamiltonian with no explicit
time dependence ($\partial\hat{H}/\partial t = 0$). In this picture, quantum states evolve
with time but the operators do not change with time. The evolution of the quantum state
$|\Psi(t)\rangle$ is governed by the Schrödinger equation

$$i\hbar\frac{d|\Psi(t)\rangle}{dt} = \hat{H}|\Psi(t)\rangle. \tag{2.60}$$

This equation has the solution
$|\Psi(t)\rangle = \exp[-\frac{i}{\hbar}\hat{H}(t - t_0)]\,|\Psi(t_0)\rangle$. It follows that
the time evolution is governed by the unitary operator
$\hat{U}(t, t_0) = \exp[-\frac{i}{\hbar}\hat{H}(t - t_0)]$. It is easy to see that
$\hat{U}(t, t_0)$ satisfies

$$i\hbar\frac{\partial\hat{U}(t, t_0)}{\partial t} = \hat{H}\hat{U}(t, t_0), \tag{2.61}$$

with the initial condition $\hat{U}(t_0, t_0) = 1$ (i.e., the time-evolution operator itself
satisfies the Schrödinger equation). The preceding equation remains valid even for
time-dependent Hamiltonians, but it is not easy to solve it in such cases.
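For a time-independent Hamiltonian the evolution operator can be built directly from the eigen-decomposition of $\hat{H}$. This is a minimal sketch, assuming $\hbar = 1$ and a hypothetical two-level Hamiltonian; `evolution_operator` is a name introduced here for illustration.

```python
import numpy as np

hbar = 1.0                                        # illustrative units
H = np.array([[1.0, 0.5], [0.5, -1.0]])           # hypothetical Hamiltonian


def evolution_operator(H, t, t0=0.0):
    """U(t, t0) = exp[-i H (t - t0) / hbar] via the eigen-decomposition of H."""
    energies, vecs = np.linalg.eigh(H)
    phases = np.exp(-1j * energies * (t - t0) / hbar)
    return vecs @ np.diag(phases) @ vecs.conj().T


psi0 = np.array([1.0, 0.0], dtype=complex)        # initial state |Psi(t0)>
U = evolution_operator(H, t=2.0)
psi_t = U @ psi0                                  # evolved state |Psi(t)>
```

Unitarity of `U` guarantees that the norm of `psi_t` stays equal to one at all times.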


The Heisenberg Picture 


In the Heisenberg picture, operators evolve in time but the quantum state remains
unchanged. However, even though the operators become time dependent, they continue
to satisfy commutation relations such as $[\hat{q}(t), \hat{p}(t)] = i\hbar$ at all times.
The time evolution of any Heisenberg operator $\hat{O}(t)$ is governed by

$$i\hbar\frac{d\hat{O}(t)}{dt} = i\hbar\frac{\partial\hat{O}(t)}{\partial t} + [\hat{O}(t), \hat{H}], \tag{2.62}$$

where the Hamiltonian $\hat{H}(q, p)$ is a function of both $q$ and $p$ that vary with time.
The Heisenberg picture can be used for all Hamiltonians irrespective of whether or not
$\hat{H}$ depends on time explicitly.

If the Schrödinger and the Heisenberg pictures are used to describe the time evolution
of the same quantum device, they should make the same predictions for any measurement
made on this device. This requires that the matrix element of any operator between any
two quantum states should be identical in the two pictures. Mathematically,

$$\langle\Psi_a(t)|\hat{O}(t_0)|\Psi_b(t)\rangle = \langle\Psi_a(t_0)|\hat{O}(t)|\Psi_b(t_0)\rangle. \tag{2.63}$$

Using the relation $|\Psi(t)\rangle = \hat{U}(t, t_0)|\Psi(t_0)\rangle$ valid for any quantum
state, the left side of this equation can be written as

$$\langle\Psi_a(t)|\hat{O}(t_0)|\Psi_b(t)\rangle = \langle\Psi_a(t_0)|\hat{U}^\dagger(t, t_0)\,\hat{O}(t_0)\,\hat{U}(t, t_0)|\Psi_b(t_0)\rangle, \tag{2.64}$$

where $\hat{U}(t, t_0) = \exp[-\frac{i}{\hbar}\hat{H}(t - t_0)]$. A comparison of the two
preceding equations leads to the important relation,
$\hat{O}(t) = \hat{U}^\dagger(t, t_0)\,\hat{O}(t_0)\,\hat{U}(t, t_0)$, that connects the
Schrödinger and the Heisenberg pictures.
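The equivalence of the two pictures can be verified on a two-level example. The sketch below assumes $\hbar = 1$, $t_0 = 0$, a hypothetical time-independent Hamiltonian, and an operator with no explicit time dependence; the names `U` and `O_heisenberg` are illustrative.

```python
import numpy as np

hbar = 1.0
H = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)     # hypothetical Hamiltonian
O0 = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)   # operator at t0 = 0


def U(t):
    """Evolution operator exp(-i H t / hbar) for the time-independent H above."""
    energies, vecs = np.linalg.eigh(H)
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T


def O_heisenberg(t):
    """Heisenberg operator O(t) = U^dagger(t) O(0) U(t)."""
    return U(t).conj().T @ O0 @ U(t)


psi_a0 = np.array([1.0, 0.0], dtype=complex)
psi_b0 = np.array([1.0, 1.0j], dtype=complex) / np.sqrt(2)
t = 0.7

# Schroedinger picture: states evolve, operator fixed
schrodinger_element = (U(t) @ psi_a0).conj() @ O0 @ (U(t) @ psi_b0)
# Heisenberg picture: operator evolves, states fixed
heisenberg_element = psi_a0.conj() @ O_heisenberg(t) @ psi_b0
```

The two matrix elements agree for any pair of states, which is the content of Eq. (2.63).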


The Interaction Picture 


The interaction picture handles Hamiltonians for which time dependence can be separated
out of the main Hamiltonian as a perturbation, resulting in the form

$$\hat{H}(t) = \hat{H}_0 + \hat{V}(t), \tag{2.65}$$

where $\hat{H}_0$ is the time-independent part with known eigenstates $|\Psi_0\rangle$
obtained by solving $\hat{H}_0|\Psi_0\rangle = E_0|\Psi_0\rangle$. The second part
$\hat{V}(t)$ depends on time but it is relatively small and acts as a perturbation. Often
this term represents interaction of a quantum device with an external field.

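In the interaction picture, both the quantum state $|\tilde{\Psi}(t)\rangle$ and the
operators $\tilde{O}(t)$ are allowed to change with time as

$$|\tilde{\Psi}(t)\rangle = \exp\left(\frac{i}{\hbar}\hat{H}_0 t\right)|\Psi(t)\rangle, \qquad \tilde{O}(t) = \exp\left(\frac{i}{\hbar}\hat{H}_0 t\right)\hat{O}(t)\exp\left(-\frac{i}{\hbar}\hat{H}_0 t\right). \tag{2.66}$$

Substituting the form of $|\tilde{\Psi}(t)\rangle$ in the Schrödinger equation, we obtain

$$i\hbar\frac{d|\tilde{\Psi}(t)\rangle}{dt} = \tilde{V}(t)|\tilde{\Psi}(t)\rangle, \qquad \tilde{V}(t) = \exp\left(\frac{i}{\hbar}\hat{H}_0 t\right)\hat{V}(t)\exp\left(-\frac{i}{\hbar}\hat{H}_0 t\right). \tag{2.67}$$

This equation is identical to the original Schrödinger equation with $\tilde{V}$ acting as an
effective Hamiltonian. Its solution can be written as

$$|\tilde{\Psi}(t)\rangle = \tilde{U}(t, t_0)|\tilde{\Psi}(t_0)\rangle, \tag{2.68}$$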


where $\tilde{U}(t, t_0)$ is a unitary operator with $\tilde{U}(t_0, t_0) = 1$. It is found
by solving its evolution equation,

$$i\hbar\frac{\partial\tilde{U}(t, t_0)}{\partial t} = \tilde{V}(t)\tilde{U}(t, t_0). \tag{2.69}$$

It is not easy to solve Eq. (2.69) because the operator $\tilde{V}$ varies with time. However,
the concept of time ordering can be used to write its solution as

$$\tilde{U}(t, t_0) = \mathcal{T}\left\{\exp\left(-\frac{i}{\hbar}\int_{t_0}^{t} d\tau\,\tilde{V}(\tau)\right)\right\}, \tag{2.70}$$

where $\mathcal{T}$ is the time-ordering operator, whose properties are discussed in Aside
2.14. Its presence ensures that the operators are ordered from right to left such that their
time arguments go from earlier times to later times. In many applications making use of the
interaction picture, the perturbation $\hat{V}$ is weak compared to $\hat{H}_0$. To the first
order, $\tilde{U}(t, t_0)$ can then be approximated as

$$\tilde{U}(t, t_0) \approx 1 - \frac{i}{\hbar}\int_{t_0}^{t} d\tau\,\tilde{V}(\tau). \tag{2.71}$$

This simplified form of the time-evolution operator forms the basis for the Kubo formula
in the linear response theory discussed in Section 3.2.
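A time-ordered exponential can be approximated numerically as a product of short-time propagators, with later times multiplied from the left. The sketch below assumes $\hbar = 1$, a hypothetical two-level system with $\hat{H}_0 = \frac{1}{2}\sigma_z$, and a weak drive $\hat{V}(t) = 0.01\cos(t)\,\sigma_x$; it compares the result with the first-order approximation.

```python
import numpy as np

hbar = 1.0
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H0 = 0.5 * sz                                     # hypothetical unperturbed part


def V(t):
    """Hypothetical weak time-dependent perturbation."""
    return 0.01 * np.cos(t) * sx


def V_int(t):
    """Interaction-picture perturbation exp(iH0t/hbar) V(t) exp(-iH0t/hbar); H0 is diagonal."""
    phase = np.exp(1j * np.real(np.diag(H0)) * t / hbar)
    return phase[:, None] * V(t) * phase.conj()[None, :]


def short_step(A, dt):
    """exp(-i A dt / hbar) for a Hermitian matrix A."""
    w, v = np.linalg.eigh(A)
    return v @ np.diag(np.exp(-1j * w * dt / hbar)) @ v.conj().T


def U_time_ordered(t, t0=0.0, steps=2000):
    """Product of short-time propagators; later times act from the left."""
    dt = (t - t0) / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        U = short_step(V_int(t0 + (k + 0.5) * dt), dt) @ U
    return U


def U_first_order(t, t0=0.0, steps=2000):
    """First-order approximation 1 - (i/hbar) * integral of V_int (midpoint rule)."""
    dt = (t - t0) / steps
    integral = sum(V_int(t0 + (k + 0.5) * dt) for k in range(steps)) * dt
    return np.eye(2) - 1j * integral / hbar


U_exact = U_time_ordered(1.0)
U_approx = U_first_order(1.0)
```

For this weak drive the two results differ only at second order in the perturbation strength, as expected from the first-order expansion.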


Aside 2.14 Time Ordering, Normal Ordering, and Wick's Theorem [126]


Time ordering: For a set of time-dependent operators, $A = \{A_n \mid n = 1, 2, \ldots, N\}$,
the time-ordering operator $\mathcal{T}$ is defined as

$$\mathcal{T}\{A_1 A_2 \cdots A_N\} = \zeta_{AB}\,B_1 B_2 \cdots B_N, \tag{2.72}$$

where the set $B = \{B_n \mid n = 1, 2, \ldots, N\}$ has the same elements as the set $A$, but
they are arranged such that operators with an earlier time appear to the right of the
operators with later times. The parameter $\zeta_{AB}$ takes values $\pm 1$ and depends not
only on the number of permutations but also on whether the particle involved is a fermion or
boson. It is defined as

$$\zeta_{AB} = \begin{cases} -1 & \text{for fermions if the number of permutations is odd} \\ +1 & \text{otherwise.} \end{cases} \tag{2.73}$$


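Time ordering obeys the distributive property:

$$\mathcal{T}\{A_1^{(1)} A_2^{(1)} \cdots A_N^{(1)} + A_1^{(2)} A_2^{(2)} \cdots A_N^{(2)}\} = \mathcal{T}\{A_1^{(1)} A_2^{(1)} \cdots A_N^{(1)}\} + \mathcal{T}\{A_1^{(2)} A_2^{(2)} \cdots A_N^{(2)}\}. \tag{2.74}$$
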
Normal ordering: Consider a set of operators, $A = \{A_n \mid n = 1, 2, \ldots, N\}$, where
each element is an annihilation or creation operator. The normal ordering operator is
defined as

$$\mathcal{N}\{A_1 A_2 \cdots A_N\} = \,:A_1 A_2 \cdots A_N:\, = \zeta_{AC}\,C_1 C_2 \cdots C_N, \tag{2.75}$$

where the set $C = \{C_n \mid n = 1, 2, \ldots, N\}$ has the same elements as the set $A$ but
the indices are permuted such that all annihilation operators appear to the right of all
creation operators. The parameter $\zeta_{AC}$ is defined identically to $\zeta_{AB}$. Normal
ordering also obeys the distributive property:

$$\mathcal{N}\{A_1^{(1)} A_2^{(1)} \cdots A_N^{(1)} + A_1^{(2)} A_2^{(2)} \cdots A_N^{(2)}\} = \mathcal{N}\{A_1^{(1)} A_2^{(1)} \cdots A_N^{(1)}\} + \mathcal{N}\{A_1^{(2)} A_2^{(2)} \cdots A_N^{(2)}\}. \tag{2.76}$$


Wick's theorem: Wick's theorem [126, 127] provides a way to translate from time to
normal ordering. Time ordering only takes into account the time stamp of the operators.
In contrast, normal ordering only takes into account the creation or annihilation character
of the operators, regardless of the time stamp attached to those operators. Wick's theorem
states

$$\mathcal{T}\{A_1 A_2 \cdots A_N\} = \mathcal{N}\{A_1 A_2 \cdots A_N\} + \mathcal{N}\{\overline{A_1 A_2}\,A_3 \cdots A_N\} + \cdots, \tag{2.77}$$

where all possible contractions are to be made. The contraction between two operators $A$
and $B$, written here with an overline, is defined as
$\overline{AB} = \langle 0|\mathcal{T}\{AB\}|0\rangle$, where $|0\rangle$ denotes the vacuum
state.
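A short worked case may help fix the idea. The sketch below assumes a single bosonic mode with $\hat{a}(t) = \hat{a}\,e^{-i\omega t}$ and $t_1 > t_2$ (an illustrative choice, not taken from the text) and shows that the time-ordered product equals the normal-ordered product plus one contraction.

```latex
% Single bosonic mode, \hat{a}(t) = \hat{a} e^{-i\omega t}, with t_1 > t_2 (assumed).
\begin{align}
\mathcal{T}\{\hat{a}(t_1)\,\hat{a}^\dagger(t_2)\}
  &= \hat{a}(t_1)\,\hat{a}^\dagger(t_2) \\
  &= \hat{a}^\dagger(t_2)\,\hat{a}(t_1) + [\hat{a}(t_1), \hat{a}^\dagger(t_2)] \\
  &= \mathcal{N}\{\hat{a}(t_1)\,\hat{a}^\dagger(t_2)\} + e^{-i\omega(t_1 - t_2)}, \\
\langle 0|\mathcal{T}\{\hat{a}(t_1)\,\hat{a}^\dagger(t_2)\}|0\rangle
  &= e^{-i\omega(t_1 - t_2)}\,\langle 0|\hat{a}\hat{a}^\dagger|0\rangle
   = e^{-i\omega(t_1 - t_2)}.
\end{align}
```

The c-number left over after normal ordering coincides with the vacuum expectation value of the time-ordered product, which is exactly the contraction entering Wick's theorem.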


2.2.6 Perturbation Theory 


There is a formal way to make approximations in quantum mechanics known as pertur- 
bation theory, and it is widely used in modeling quantum devices and for estimating their 
performance. The technique is somewhat different depending on whether the Hamiltonian 
depends explicitly on time or not. 

We consider here the time-independent perturbation theory [96]. It expresses the quantum
states of a perturbed Hamiltonian, $\hat{H}_0 + \hat{V}$, using the eigenstates and energies
of its nonperturbed part $\hat{H}_0$ that are found by solving
$\hat{H}_0|\Psi_0\rangle = E|\Psi_0\rangle$. It is possible to find the perturbed quantum
state $|\Psi\rangle$ of the full Hamiltonian with the same energy $E$ using

$$|\Psi\rangle = |\Psi_0\rangle + K(E)\hat{V}|\Psi\rangle, \tag{2.78}$$

where the operator $K(E) = (E - \hat{H}_0)^{-1}$ is a type of Green's function known as the
Lippmann-Schwinger kernel [128]. Noting that $\det(E - \hat{H}_0) = 0$ (because $E$ is an
eigenvalue of $\hat{H}_0$), the kernel $K$ is singular. This singularity is eliminated by
replacing $E$ with $E + i\epsilon$, where $\epsilon$ is an infinitesimally small real number
such that

$$K(E) = \lim_{\epsilon \to 0} \frac{1}{E - \hat{H}_0 + i\epsilon}. \tag{2.79}$$

As our goal is to express $|\Psi\rangle$ in terms of $|\Psi_0\rangle$, we need to find a way
to eliminate $|\Psi\rangle$ from the right side of Eq. (2.78). This is done by defining the
transfer matrix operator $T$ such that $\hat{V}|\Psi\rangle = T|\Psi_0\rangle$. Using it in
Eq. (2.78) and simplifying we obtain a self-consistent equation
$T = \hat{V} + \hat{V}K(E + i\epsilon)T$, which can be solved iteratively to get

$$T = \hat{V} + \hat{V}K(E + i\epsilon)\hat{V} + \hat{V}K(E + i\epsilon)\hat{V}K(E + i\epsilon)\hat{V} + \cdots. \tag{2.80}$$

When the eigenfunctions of $\hat{H}_0$ form a complete and orthonormal set, it is possible
to find matrix elements of the operator $T$, which essentially represent transition
amplitudes between any two eigenstates. Suppose $\{|\Psi_{0j}\rangle\}$ is the complete set of
orthonormal eigenfunctions of the unperturbed Hamiltonian $\hat{H}_0$ such that
$\hat{H}_0\Psi_{0j} = E_j\Psi_{0j}$. The matrix elements between $|\Psi_{0m}\rangle$ and
$|\Psi_{0n}\rangle$ can be calculated from Eq. (2.80) as

$$\langle\Psi_{0m}|T|\Psi_{0n}\rangle = \langle\Psi_{0m}|\hat{V}|\Psi_{0n}\rangle + \sum_j \langle\Psi_{0m}|\hat{V}|\Psi_{0j}\rangle K(E_j + i\epsilon)\langle\Psi_{0j}|\hat{V}|\Psi_{0n}\rangle + \cdots. \tag{2.81}$$

This equation is known as the Born series for the T-matrix [129]. The first three terms in
this series are referred to as the first, second, and third Born approximations [96, 130].
The structure of each term provides insight into the underlying physical process. A nonzero
$V_{jm}$ creates an intermediate state $|\Psi_{0j}\rangle$, which lasts until it connects to
another state $|\Psi_{0k}\rangle$ by the Lippmann-Schwinger kernel, and so on. The whole
process starts at the initial state and continues until the kernel $K$ hits the final state.
The Feynman diagrams are traditionally used to represent such a process [131].

Equation (2.81) can be used to calculate the transition rate $\Gamma_{mn}$ between any two
stationary states using the generalized Fermi's golden rule given by [132]

$$\Gamma_{mn} = \frac{2\pi}{\hbar}\,|\langle\Psi_{0m}|T|\Psi_{0n}\rangle|^2\,\delta(E_m - E_n). \tag{2.82}$$

Here we have used Fermi's rule in the context of the T matrix. The original version, referred
to as Fermi's golden rule, differs from it by the substitution $T \to \hat{V}$.

2.3 Quantization of Electromagnetic Fields 


The design and analysis of quantum devices inevitably require consideration of their 
interaction with one or more external fields. Electromagnetic fields are the most widely 
manipulated fields in such devices. In this section, we consider how electromagnetic
interactions are handled for nanoscale quantum devices. As physical laws do not depend on
the motion of observers in different inertial frames, Lorentz invariance plays a critical
role, requiring that Maxwell's equations be invariant under Lorentz transformations [94, 98];
see Aside 2.15. We start with the standard Maxwell's equations and recast them in the
covariant form. This approach provides us a way to present two different formulations found
in the literature. Familiarity with both formulations is necessary for modeling nanoscale
quantum devices.


2.3.1 Maxwell's Equations and Gauge Invariance 


In the Système International (SI) units adopted in this book, Maxwell's equations can be
written as [133]

$$\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0,$$
$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0, \qquad \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J}, \tag{2.83}$$

where $\epsilon_0$ is the vacuum permittivity, $\mu_0$ is the vacuum permeability, and $c$ is
the speed of light ($c = 1/\sqrt{\mu_0\epsilon_0}$). Here $\mathbf{E}$ is the electric field
and $\mathbf{B}$ is the magnetic flux density. It is easy to show that the charge density
$\rho$ and the current density $\mathbf{J}$ satisfy the continuity equation to ensure charge
conservation:

$$\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0. \tag{2.84}$$

The fields $\mathbf{E}$ and $\mathbf{B}$ interact with a quantum device by exerting a force on
every charged particle. This is called the Lorentz force and is given by
$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$, where $\mathbf{v}$ is the velocity
of the charged particle.
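The continuity equation follows from Maxwell's equations in a few steps. The derivation below is a sketch that takes the divergence of the equation for $\nabla\times\mathbf{B}$ and uses Gauss's law together with $c^2 = 1/(\mu_0\epsilon_0)$.

```latex
\begin{align}
0 = \nabla\cdot(\nabla\times\mathbf{B})
  &= \mu_0\,\nabla\cdot\mathbf{J}
     + \frac{1}{c^2}\frac{\partial}{\partial t}\left(\nabla\cdot\mathbf{E}\right) \\
  &= \mu_0\,\nabla\cdot\mathbf{J}
     + \mu_0\epsilon_0\,\frac{\partial}{\partial t}\left(\frac{\rho}{\epsilon_0}\right) \\
  &= \mu_0\left(\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t}\right).
\end{align}
```

Dividing by $\mu_0$ gives $\nabla\cdot\mathbf{J} + \partial\rho/\partial t = 0$, so charge conservation is built into Maxwell's equations rather than imposed separately.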

Equations (2.83) are not manifestly covariant, as their form in terms of $\mathbf{E}$ and $\mathbf{B}$ is not
obviously retained after a Lorentz transformation [134, 135]. Aside 2.15 describes the Lorentz
transformation and the tensor notation used hereafter. Noting that moving charges imply
current flow, it is evident that a charge at rest in one inertial frame will appear as a
current in another inertial frame. The quantity that satisfies this property is the
contravariant current density: $J^\mu = (c\rho, \mathbf{J})$. It combines charge and current densities into
a single 4-vector (a vector with four components) that ensures charge conservation under
Lorentz transformations (see Aside 2.15).


Aside 2.15 Tensor Notation and Lorentz Transformation [76, 77, 98, 136] 

We first discuss the concept of a 4-vector. The position vector in 3D needs three numbers,
$x, y, z$, to specify it uniquely. To unify space and time, a vector in four dimensions
is introduced with the components $x^\mu = (ct, x, y, z)$ to represent a space-time point. In
the following discussion, the Greek indices ($\mu, \nu, \ldots$) take four possible values (0, 1, 2, 3),
whereas the Latin indices ($i, j, k, \ldots$) take three possible values (1, 2, 3). The use of a
superscript to denote a vector may be confusing to readers not versed in the theory of relativity.
In fact, we need to use two kinds of vectors. A vector with a superscript is called a
contravariant vector; if a subscript is used, it is called a covariant vector. For example, the
gradient of a scalar quantity $S$ is a covariant vector denoted as $\partial_\mu S = \partial S/\partial x^\mu$.

Tensors constitute a generalization of the vector concept and require multiple indices to
represent them; the rank of a tensor equals the number of indices required to represent it.
It is useful to employ Einstein’s summation convention on repeated indices whenever no
ambiguity is likely to occur [137]. A useful tensor is the Levi-Civita tensor (also known as
the alternating tensor). It is defined as

$$
\epsilon^{ijk} = \epsilon_{ijk} =
\begin{cases}
+1, & \text{if an even permutation puts the indices in the order } 1, 2, \ldots\\
-1, & \text{if an odd permutation puts the indices in the order } 1, 2, \ldots\\
0, & \text{if two or more indices are the same.}
\end{cases}
\tag{2.85}
$$


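This tensor satisfies the following useful identities:

$$
\epsilon_{ijk}\epsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}, \qquad
\epsilon^{ijk}\epsilon_{imn} = \delta^j_m\delta^k_n - \delta^j_n\delta^k_m. \tag{2.86}
$$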
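The contraction identity for the Levi-Civita tensor can be checked by brute force over all index values. The following is a minimal sketch, not part of the original text; the helper names `levi_civita`, `delta`, and `contraction_identity_holds` are hypothetical, and the closed-form expression used for the symbol is a convenience valid only for indices in {1, 2, 3}.

```python
from itertools import product


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices drawn from {1, 2, 3}.

    (i - j)(j - k)(k - i) / 2 equals +1 for even permutations of (1, 2, 3),
    -1 for odd permutations, and 0 when any two indices coincide.
    """
    return (i - j) * (j - k) * (k - i) // 2


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def contraction_identity_holds():
    """Check eps_{ijk} eps_{imn} = d_{jm} d_{kn} - d_{jn} d_{km} (sum over i)."""
    indices = (1, 2, 3)
    for j, k, m, n in product(indices, repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in indices)
        rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
        if lhs != rhs:
            return False
    return True


print(contraction_identity_holds())
```

Because the metric plays no role for purely spatial Kronecker deltas in this check, the same loop covers both index placements of the identity.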




The space-time metric can be used to convert between the covariant and contravariant
vectors. This metric is a second-rank tensor defined such that $g_{\mu\nu} = 0$ when $\mu \neq \nu$. It
can be represented in the form of a diagonal matrix with the elements $g_{00} = +1$,
$g_{11} = g_{22} = g_{33} = -1$. The signature of the space-time metric used here is $(+, -, -, -)$, which
corresponds to the hyperbolic representation of the Minkowski space [138]. The distance
$ds$ between two neighboring space-time points satisfies

$$
ds^2 = dx_\mu\,dx^\mu = g_{\mu\nu}\,dx^\mu dx^\nu = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{2.87}
$$


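We also use the following notation for derivatives: $\partial_\mu = \partial/\partial x^\mu$ and $\partial^\mu = \partial/\partial x_\mu$. Using this
notation, we can introduce the 4-velocity $u^\mu$ with components $\gamma(c, \mathbf{u})$, where $\mathbf{u}$ is the 3D
velocity and $\gamma = (1 - u^2/c^2)^{-1/2}$. The 4-momentum $p^\mu$ can also be introduced using
$(E/c, \mathbf{p}) = m_0u^\mu$, where $\mathbf{p}$ is the 3D momentum, $E$ is the total energy, and $m_0$ is the rest
mass of the particle. The differential operators can be generalized to define the 4-gradient as

$$
\partial^\mu = \frac{\partial}{\partial x_\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right), \tag{2.88}
$$

where $\nabla$ is the usual 3D gradient vector defined as $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$. The Laplacian
operator $\nabla^2 = \nabla\cdot\nabla$ becomes the d’Alembertian operator in four dimensions and is
defined as

$$
\partial^2 \equiv \partial^\mu\partial_\mu = g^{\mu\nu}\partial_\mu\partial_\nu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2. \tag{2.89}
$$

We can now discuss the Lorentz transformation. It takes the simple form:

$$
x'^\mu = \Lambda^\mu{}_\nu\,x^\nu, \qquad \Lambda^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}. \tag{2.90}
$$

The inverse transformation can be written as

$$
x^\mu = \Lambda_\nu{}^\mu\,x'^\nu, \qquad \Lambda_\nu{}^\mu = \frac{\partial x^\mu}{\partial x'^\nu}. \tag{2.91}
$$

Here $\Lambda^\mu{}_\nu$ and $\Lambda_\nu{}^\mu$ are two tensors whose matrix elements obey the relation
$\Lambda^\mu{}_\alpha\Lambda_\nu{}^\alpha = \delta^\mu_\nu$, which implies that the matrix of the inverse transformation is the
matrix inverse $\Lambda^{-1}$. Owing to these relations, a contravariant 4-vector transforms as
$A'^\mu = \Lambda^\mu{}_\nu A^\nu$ and $A^\mu = \Lambda_\nu{}^\mu A'^\nu$. In contrast, the covariant 4-vector
transforms as $A'_\mu = \Lambda_\mu{}^\nu A_\nu$ and $A_\mu = \Lambda^\nu{}_\mu A'_\nu$.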
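As a concrete illustration of the transformation rule for a contravariant 4-vector, the sketch below applies a boost along the $x$ axis to a space-time point and checks that the interval $x_\mu x^\mu$ is unchanged. It is a minimal example under stated assumptions: the boost direction, the value $\beta = 0.6$, the sample event, and the helper names `boost_x`, `transform`, and `interval` are all illustrative choices, not taken from the text.

```python
import math

C = 299_792_458.0                      # speed of light (m/s)
METRIC = [1.0, -1.0, -1.0, -1.0]       # diagonal of g_{mu nu}, signature (+, -, -, -)


def boost_x(beta):
    """Return Lambda^mu_nu for a boost with speed beta*c along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(matrix, vec):
    """Contravariant rule x'^mu = Lambda^mu_nu x^nu."""
    return [sum(matrix[m][n] * vec[n] for n in range(4)) for m in range(4)]


def interval(x):
    """Invariant x_mu x^mu = g_{mu nu} x^mu x^nu for a diagonal metric."""
    return sum(METRIC[m] * x[m] * x[m] for m in range(4))


event = [C * 2.0e-9, 0.3, -0.1, 0.2]            # x^mu = (ct, x, y, z) in metres
boosted = transform(boost_x(0.6), event)        # coordinates in the moving frame
restored = transform(boost_x(-0.6), boosted)    # inverse transformation

print(interval(event), interval(boosted))
print(restored)
```

For a pure boost the inverse matrix is obtained by reversing the sign of $\beta$, which is why `boost_x(-0.6)` recovers the original coordinates up to rounding.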


The next step requires the introduction of vector and scalar potentials. Justification for
such an approach is that some physical phenomena stem from these potentials. An example
is provided by the Aharonov-Bohm effect discussed in Section 1.4. The relationship
between the potentials and fields is governed by the relations:

$$
\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}. \tag{2.92}
$$

An equivalent way to satisfy these relations is to introduce the gauge-field 4-vector as
$A^\mu = (\phi/c, \mathbf{A})$ and define the field-strength tensor $F^{\mu\nu}$ as

$$
F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \tag{2.93}
$$

The preceding definition automatically satisfies the two relations between the fields and
potentials in Eq. (2.92). This can be seen by expanding $F^{\mu\nu}$ and noting that its elements
contain the components of $\mathbf{E}$ and $\mathbf{B}$ as follows:

$$
F^{\mu\nu} =
\begin{pmatrix}
0 & -E_1/c & -E_2/c & -E_3/c\\
E_1/c & 0 & -B_3 & B_2\\
E_2/c & B_3 & 0 & -B_1\\
E_3/c & -B_2 & B_1 & 0
\end{pmatrix}. \tag{2.94}
$$

The associated covariant tensor $F_{\mu\nu}$ is found using the relation $F_{\mu\nu} = g_{\mu\alpha}g_{\nu\beta}F^{\alpha\beta}$. It has
the form

$$
F_{\mu\nu} =
\begin{pmatrix}
0 & E_1/c & E_2/c & E_3/c\\
-E_1/c & 0 & -B_3 & B_2\\
-E_2/c & B_3 & 0 & -B_1\\
-E_3/c & -B_2 & B_1 & 0
\end{pmatrix}. \tag{2.95}
$$

The main difference between $F^{\mu\nu}$ and $F_{\mu\nu}$ is a change in the signs of the components
of the electric field. Owing to the simple structure of these tensors, we can extract the
electric and magnetic fields by using

$$
E_i = cF_{0i} = -cF^{0i}; \qquad B_i = -\frac{1}{2}\epsilon_{ijk}F^{jk}. \tag{2.96}
$$
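The bookkeeping between the fields and the field-strength tensor is easy to get wrong by a sign, so a small numerical round trip can be useful. The sketch below assumes the sign conventions that follow from Eq. (2.93) with the $(+,-,-,-)$ metric, namely $F^{0i} = -E_i/c$ and $F^{ij} = -\epsilon_{ijk}B_k$; the function names (`field_tensor`, `lower_indices`, `extract_fields`) and the sample field values are hypothetical.

```python
import math

C = 299_792_458.0                 # speed of light (m/s)
G = [1.0, -1.0, -1.0, -1.0]       # diagonal metric, signature (+, -, -, -)


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Contravariant F^{mu nu} built from E (V/m) and B (T)."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def lower_indices(F_up):
    """F_{mu nu} = g_{mu a} g_{nu b} F^{ab}; the sums collapse for a diagonal metric."""
    return [[G[m] * G[n] * F_up[m][n] for n in range(4)] for m in range(4)]


def extract_fields(F_up):
    """Recover E_i = c F_{0i} and B_i = -(1/2) eps_{ijk} F^{jk}."""
    F_down = lower_indices(F_up)
    E = [C * F_down[0][i] for i in (1, 2, 3)]
    B = [
        -0.5 * sum(levi_civita(i, j, k) * F_up[j][k]
                   for j in (1, 2, 3) for k in (1, 2, 3))
        for i in (1, 2, 3)
    ]
    return E, B


E_in = [1.0e3, -2.0e3, 5.0e2]          # sample electric field (V/m)
B_in = [1.0e-6, 2.0e-6, -3.0e-6]       # sample magnetic flux density (T)
E_out, B_out = extract_fields(field_tensor(E_in, B_in))
print(E_out, B_out)
```

Lowering both indices only flips the sign of the mixed time-space components, which is the sign change of the electric-field entries noted in the text.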


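Using these definitions, we can combine two Maxwell’s equations, $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$ and
$\nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J}$, into one covariant equation

$$
\partial_\mu F^{\mu\nu} = \mu_0J^\nu, \tag{2.97}
$$

where the 4-current is defined as $J^\nu = (c\rho, \mathbf{J})$. The other two Maxwell’s equations in Eq.
(2.83) can also be incorporated in the tensor equation

$$
\partial_\sigma F_{\mu\nu} + \partial_\mu F_{\nu\sigma} + \partial_\nu F_{\sigma\mu} = 0. \tag{2.98}
$$

The apparent complexity of this equation can be reduced by introducing a dual tensor,
$G^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\sigma\tau}F_{\sigma\tau}$. It enables us to write Eq. (2.98) in the compact form

$$
\partial_\mu G^{\mu\nu} = 0. \tag{2.99}
$$

A direct evaluation of the terms of this equation confirms that it contains the remaining two
Maxwell’s equations: $\nabla\cdot\mathbf{B} = 0$ and $\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t$.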

The gauge-field 4-vector $A^\mu = (\phi/c, \mathbf{A})$ has some interesting properties. It is easy to see
that $\mathbf{B}$ remains unchanged if we add to $\mathbf{A}$ the gradient of a scalar function. However, we
also need to modify the scalar potential to keep the electric field unchanged. Such changes
in the vector and scalar potentials are referred to as a gauge transformation [139]. For a
given scalar function $\chi$, the gauge transformation is written as

$$
\phi \to \phi - \frac{\partial\chi}{\partial t}, \qquad \mathbf{A} \to \mathbf{A} + \nabla\chi. \tag{2.100}
$$

These two relations can be combined in the following compact form:

$$
A^\mu \to A^\mu - \partial^\mu\chi. \tag{2.101}
$$

It is easy to check that such a gauge transformation does not change the field-strength
tensor:

$$
F^{\mu\nu} \to \partial^\mu(A^\nu - \partial^\nu\chi) - \partial^\nu(A^\mu - \partial^\mu\chi) = F^{\mu\nu}. \tag{2.102}
$$


This gauge freedom is often useful for solving specific problems because one can impose 
a suitable gauge condition to simplify the underlying mathematics. The gauge condition 
fixes the gauge and eliminates the redundant degrees of freedom, while ensuring that 
the observable quantities derived at the end of calculation remain gauge invariant [139]. 
Several gauge conditions are discussed in Aside 2.16. 
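Gauge invariance of the fields can also be confirmed numerically. The sketch below evaluates $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$ by central finite differences before and after a gauge transformation. Everything specific here is an assumption made for illustration: the sample potentials, the gauge function $\chi = \sin x\cos 2t + yz$, the evaluation point, the dimensionless units, and the helper names `partial` and `fields`.

```python
import math


def partial(f, point, axis, h=1e-5):
    """Central-difference derivative of f along `axis` (0 = t, 1..3 = x, y, z)."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def fields(phi, A, point):
    """Return (E, B) with E = -grad(phi) - dA/dt and B = curl(A) at point = (t, x, y, z)."""
    comp = [lambda *p, i=i: A(*p)[i] for i in range(3)]
    E = [-partial(phi, point, i + 1) - partial(comp[i], point, 0) for i in range(3)]
    B = [
        partial(comp[2], point, 2) - partial(comp[1], point, 3),   # dAz/dy - dAy/dz
        partial(comp[0], point, 3) - partial(comp[2], point, 1),   # dAx/dz - dAz/dx
        partial(comp[1], point, 1) - partial(comp[0], point, 2),   # dAy/dx - dAx/dy
    ]
    return E, B


# Sample potentials (arbitrary smooth functions chosen for the demonstration).
def phi(t, x, y, z):
    return x * x * y - t * z


def A(t, x, y, z):
    return (y * t, x * z, math.sin(t) * x * y)


# Analytic derivatives of the gauge function chi = sin(x) cos(2t) + y z.
def chi_t(t, x, y, z):
    return -2.0 * math.sin(x) * math.sin(2.0 * t)


def grad_chi(t, x, y, z):
    return (math.cos(x) * math.cos(2.0 * t), z, y)


# Gauge-transformed potentials: phi -> phi - d(chi)/dt, A -> A + grad(chi).
def phi_new(t, x, y, z):
    return phi(t, x, y, z) - chi_t(t, x, y, z)


def A_new(t, x, y, z):
    return tuple(a + g for a, g in zip(A(t, x, y, z), grad_chi(t, x, y, z)))


point = (0.3, 0.7, -0.4, 1.1)
E_old, B_old = fields(phi, A, point)
E_new, B_new = fields(phi_new, A_new, point)
print(E_old, E_new)
print(B_old, B_new)
```

The potentials themselves differ at the chosen point, yet the computed fields agree to within the finite-difference error.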


Aside 2.16 Common Gauge Transformations [140] 
Great care should be exercised when using gauge transformations because their incorrect
use can lead to situations where the resulting equations, though mathematically correct,
describe unphysical scenarios [141]. However, if one is prepared to accept this limitation,
the results are often valuable for constructing simplified descriptions. For example, the
widely used Coulomb gauge [142] assumes that the scalar potential is felt instantaneously
at any location (i.e., it violates special relativity). However, it is widely used in quantum
mechanics because it enables the static and dynamic interactions to be separated. Similarly,
the Kirchhoff gauge [143] assumes imaginary propagation speeds; the velocity gauge allows
superluminal propagation; and a zero propagation speed is used in the static Coulomb
gauge [144].

The Lorenz gauge is defined by the gauge condition [145]

$$
\partial_\mu A^\mu = 0 \;\Rightarrow\; \nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0. \tag{2.103}
$$

This gauge is Lorentz covariant and is widely used for quantization of fields. When it is
used for the inhomogeneous Maxwell’s equations (with sources), it decouples the wave
equations for the vector and scalar potentials such that $\partial^2\mathbf{A} = \mu_0\mathbf{J}$ and $\partial^2\phi = \rho/\epsilon_0$, where
$\partial^2 = \partial_\mu\partial^\mu$ is the d’Alembertian operator. In the 4-vector notation, we obtain a single
equation $\partial_\mu\partial^\mu A^\nu = \mu_0J^\nu$. The continuity condition also takes the form $\partial^\mu J_\mu = 0$. In the
absence of charges and currents, the resulting homogeneous equations are much easier to solve,
increasing the value of this transformation. Caution should be exercised because the Lorenz
gauge does not fully specify the scalar and vector potentials. Any function $\chi$ with the
property $\partial^2\chi = 0$ applied to Eq. (2.101) does not alter the observed field values.


The Coulomb gauge is defined by the gauge condition [142]

$$
\nabla\cdot\mathbf{A} = 0. \tag{2.104}
$$

Unlike the Lorenz gauge, the Coulomb gauge is not covariant. However, if the scalar potential
does not depend on time, this gauge becomes equivalent to the Lorenz gauge. It is
widely used in quantum-mechanical calculations, where the vector potential is quantized
but the Coulomb interaction is treated classically. Under this gauge, the scalar potential is
set by the Poisson equation, $\nabla^2\phi = -\rho/\epsilon_0$, implying that $\phi$ is experienced instantaneously
at any location. This can be clearly seen by considering the velocity gauge where the scalar
potential is assumed to propagate with an arbitrary speed $v$, rather than the speed of light
[142]. With this change, $\phi$ is obtained by solving

$$
\nabla^2\phi - \frac{1}{v^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\epsilon_0}, \qquad
\nabla\cdot\mathbf{A} + \frac{1}{v^2}\frac{\partial\phi}{\partial t} = 0, \tag{2.105}
$$

where the velocity-gauge condition is also specified. It is easy to see that both equations
reduce to those in the Coulomb gauge in the limit $v \to \infty$. As a result, the retardation
effects do not appear in the scalar potential if the Coulomb gauge is used. However,
the associated vector potential can still exhibit the retardation effects. A special case of
the Coulomb gauge is the Weyl gauge (also referred to as the radiation gauge) where the scalar
potential is set to zero ($\phi = 0$).


In the classical electromagnetic theory, gauge invariance is realized through measurements
of the electric and magnetic fields. Even though the scalar and vector potentials
change, the associated fields stay the same in different gauges. However, when electromagnetic
theory is coupled with quantum mechanics, it becomes apparent that the vector
potential can have measurable effects in phenomena such as the Aharonov-Bohm effect
[146]. Therefore, attention is required when fixing a gauge to ensure that the predicted
phenomena are physically realizable. The introduction of the gauge-field 4-vector $A^\mu$ in
Eq. (2.93) reduces the number of redundant degrees of freedom in the field components of
Maxwell’s equations and provides a complete set of dynamical variables for the underlying
theory.

Even though we have used $\mathbf{E}$ and $\mathbf{B}$ as the main vector fields in Maxwell’s equations,
two other vectors, $\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$ and $\mathbf{H} = \mathbf{B}/\mu_0 - \mathbf{M}$, become important when
electromagnetic interactions are studied in materials. The quantities $\mathbf{P}$ and $\mathbf{M}$ are called material
polarization and material magnetization, respectively. We have already seen that both $\mathbf{E}$
and $\mathbf{B}$ can be recovered from the second-rank tensor $F_{\mu\nu}$. Similarly, we can incorporate
pairs $\{\mathbf{D}, \mathbf{H}\}$ and $\{\mathbf{P}, \mathbf{M}\}$ into tensors $H_{\mu\nu}$ and $P_{\mu\nu}$, respectively. The resulting relation is
$H_{\mu\nu} = F_{\mu\nu}/\mu_0 + P_{\mu\nu}$. It can be expanded to reveal the following matrix equation:

$$
\begin{pmatrix}
0 & cD_x & cD_y & cD_z\\
-cD_x & 0 & -H_z & H_y\\
-cD_y & H_z & 0 & -H_x\\
-cD_z & -H_y & H_x & 0
\end{pmatrix}
= \frac{1}{\mu_0}\left[F_{\mu\nu}\right] +
\begin{pmatrix}
0 & cP_x & cP_y & cP_z\\
-cP_x & 0 & M_z & -M_y\\
-cP_y & -M_z & 0 & M_x\\
-cP_z & M_y & -M_x & 0
\end{pmatrix}. \tag{2.106}
$$

It is easy to show by direct substitution that the four Maxwell’s equations inside a material
can be written in the following compact form:

$$
\partial_\mu H^{\mu\nu} = J^\nu. \tag{2.107}
$$


2.3.2 Invariants and Lagrangian of the Electromagnetic Field


The tensor notation helps us to find functions that remain invariant during the interaction
of matter with radiation. All invariants are zero-rank tensors. A widely used technique for
finding invariants makes use of the operator identity $\partial_\mu\partial_\nu = \partial_\nu\partial_\mu$. As an example, if
we deploy this identity in Eq. (2.107) after applying the operator $\partial_\nu$ and note that
$H_{\mu\nu} = -H_{\nu\mu}$, we obtain an invariant, $\partial_\nu J^\nu = 0$, which is the continuity equation reflecting
charge conservation. Three other important invariants are:

$$
F^{\mu\nu}F_{\mu\nu} = 2\left(B^2 - E^2/c^2\right), \qquad
F^{\mu\nu}G_{\mu\nu} = -\frac{4}{c}\,\mathbf{B}\cdot\mathbf{E}, \qquad
J_\mu A^\mu = \rho\phi - \mathbf{J}\cdot\mathbf{A}, \tag{2.108}
$$

where the last invariant corresponds to the energy density of a charge interacting with
an electromagnetic field. The validity of these invariants can be easily established by
evaluating each directly.
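The first of these invariants can be evaluated directly and compared in two inertial frames. The sketch below is illustrative only: it assumes the sign convention $F^{0i} = -E_i/c$, $F^{ij} = -\epsilon_{ijk}B_k$ for the $(+,-,-,-)$ metric, a boost along $x$ with $\beta = 0.8$, and arbitrary sample fields; the helper names are hypothetical.

```python
import math

C = 299_792_458.0                 # speed of light (m/s)
G = [1.0, -1.0, -1.0, -1.0]       # diagonal metric, signature (+, -, -, -)


def field_tensor(E, B):
    """Contravariant F^{mu nu} from E (V/m) and B (T)."""
    ex, ey, ez = (component / C for component in E)
    bx, by, bz = B
    return [
        [0.0, -ex, -ey, -ez],
        [ex, 0.0, -bz, by],
        [ey, bz, 0.0, -bx],
        [ez, -by, bx, 0.0],
    ]


def boost_x(beta):
    """Lambda^mu_nu for a boost with speed beta*c along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_tensor(L, F):
    """F'^{mu nu} = Lambda^mu_a Lambda^nu_b F^{ab}."""
    return [
        [sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
         for n in range(4)]
        for m in range(4)
    ]


def invariant_FF(F_up):
    """F^{mu nu} F_{mu nu}; lowering indices multiplies by g_mm * g_nn."""
    return sum(G[m] * G[n] * F_up[m][n] ** 2 for m in range(4) for n in range(4))


E = [300.0, -150.0, 60.0]             # sample electric field (V/m)
B = [2.0e-6, -1.0e-6, 3.0e-6]         # sample magnetic flux density (T)
F = field_tensor(E, B)
F_boosted = transform_tensor(boost_x(0.8), F)
expected = 2.0 * (sum(b * b for b in B) - sum(e * e for e in E) / C ** 2)

print(invariant_FF(F), invariant_FF(F_boosted), expected)
```

The individual tensor components change under the boost, but the contraction keeps the value $2(B^2 - E^2/c^2)$ computed in the original frame.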

One important Lorentz invariant of the electromagnetic field is its action $S$, which is
defined using the Lagrangian $\mathcal{L}$ as [147]

$$
S = \int\mathcal{L}(A, \partial^\mu A)\,d^4x. \tag{2.109}
$$

The assumption that action is an integral over a Lagrangian that is a functional of the
fields and their derivatives is called the locality hypothesis [148]. Also, it is important to
recognize that some of these quantities are not physically observable (the definition of $S$ is
purely mathematical). Note also the four-dimensional integral resulting from the 4-vector
$x^\mu$. The 4-vector $A$ depends on $x^\mu$, but the Lagrangian does not depend on $x^\mu$ explicitly.
As before, the action should be stationary (i.e., $\delta S = 0$). We use this condition to derive the
Euler-Lagrange equations in Aside 2.17.


Aside 2.17 Euler-Lagrange Equation for Covariant Functions 


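We start with the action defined in Eq. (2.109). Its variation can be written as

$$
\delta S = \int\left(\frac{\partial\mathcal{L}}{\partial A_\mu}\,\delta A_\mu + \frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)}\,\delta(\partial^\nu A_\mu)\right)d^4x. \tag{2.110}
$$

We rearrange this expression by noting that $\delta(\partial^\nu A_\mu) = \partial^\nu(\delta A_\mu)$:

$$
\delta S = \int\left\{\left[\frac{\partial\mathcal{L}}{\partial A_\mu} - \partial^\nu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)}\right)\right]\delta A_\mu + \partial^\nu\!\left[\frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)}\,\delta A_\mu\right]\right\}d^4x. \tag{2.111}
$$

The second term inside the integral is a 4-divergence. We use the divergence theorem (a
generalization of the Gauss theorem) to show that it is zero because $\delta A_\mu$ vanishes on the
boundary of the integral. As a result, we obtain

$$
\delta S = \int\left[\frac{\partial\mathcal{L}}{\partial A_\mu} - \partial^\nu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)}\right)\right]\delta A_\mu\,d^4x. \tag{2.112}
$$

As the action principle demands $\delta S = 0$ for all variations of $A_\mu$, we immediately obtain
the Euler-Lagrange equation

$$
\frac{\partial\mathcal{L}}{\partial A_\mu} = \partial^\nu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)}\right). \tag{2.113}
$$

The preceding derivation highlights that, if we add a divergence term to a known
Lagrangian, the equations of motion resulting from that Lagrangian do not change. Recall
the earlier discussion that many different Lagrangians can lead to the same equations of
motion. For example, noting that the interaction of a charge with an electromagnetic field
has the form $J^\mu A_\mu$, the Lagrangian for this case can be easily obtained.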


Before we can apply the Euler-Lagrange equations, we need to find a Lagrangian appropriate
for Maxwell’s equations. This Lagrangian should be based on the 4-vector $A^\mu$. Using
the gauge invariance, Lorentz invariance, and the requirement that the equations of motion
must be linear partial differential equations, the most general form of the Lagrangian can
be written as

$$
\mathcal{L}(A, \partial A) = x_1A^\mu A_\mu + x_2\,\partial_\mu A_\nu\partial^\mu A^\nu + x_3\,\partial_\mu A_\nu\partial^\nu A^\mu + x_4(\partial_\mu A^\mu)^2, \tag{2.114}
$$

where $x_1$, $x_2$, $x_3$, and $x_4$ are constants that need to be fixed by ensuring that this Lagrangian
produces Maxwell’s equations correctly. As shown in Aside 2.18, Maxwell’s equations are
obtained if we choose $x_1 = 0$ and $x_2 + x_3 + x_4 = 0$. Using them, the Lagrangian has the
form

$$
\mathcal{L}(A, \partial A) = x_3\left(\partial_\mu A_\nu\partial^\nu A^\mu - \partial_\mu A_\nu\partial^\mu A^\nu\right) + x_4\left[(\partial_\mu A^\mu)^2 - \partial_\mu A_\nu\partial^\mu A^\nu\right]. \tag{2.115}
$$

If we now use the definition of $F^{\mu\nu}$ in Eq. (2.93) and demand that all fields vanish at
infinity, we can write it in the simple form

$$
\mathcal{L}(A, \partial A) = -\frac{1}{4\mu_0}F_{\mu\nu}F^{\mu\nu}, \tag{2.116}
$$

where the prefactor $-1/(4\mu_0)$ was chosen to yield the correct equations of motion.

As we saw earlier, $F^{\mu\nu}F_{\mu\nu}$ is an invariant and has the value $F^{\mu\nu}F_{\mu\nu} = 2(B^2 - E^2/c^2)$. As
shown in Aside 2.18, the Euler-Lagrange equations of this Lagrangian are Maxwell’s
equations. If the 4-currents are also added to this (see the discussion at the end of Aside
2.17), the Lagrangian that would generate Maxwell’s equations interacting with charges
becomes

$$
\mathcal{L}(A, \partial A) = -\frac{1}{4\mu_0}F_{\mu\nu}F^{\mu\nu} - J_\mu A^\mu. \tag{2.117}
$$


Aside 2.18 Lagrangian for the Electromagnetic Field [149] 
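Using the Lagrangian given in Eq. (2.114), we can easily calculate the following two
derivatives:

$$
\frac{\partial\mathcal{L}}{\partial A_\mu} = 2x_1A^\mu, \tag{2.118}
$$

$$
\frac{\partial\mathcal{L}}{\partial(\partial^\nu A_\mu)} = 2x_2\,\partial_\nu A^\mu + 2x_3\,\partial^\mu A_\nu + 2x_4\,\delta^\mu_\nu(\partial_\sigma A^\sigma). \tag{2.119}
$$

Substituting these derivatives in the Euler-Lagrange equation (2.113), we obtain

$$
2x_1A^\mu = \partial^\nu\left(2x_2\,\partial_\nu A^\mu + 2x_3\,\partial^\mu A_\nu + 2x_4\,\delta^\mu_\nu(\partial_\sigma A^\sigma)\right), \tag{2.120}
$$

which can be simplified to obtain

$$
x_1A^\mu = x_2(\partial^\nu\partial_\nu)A^\mu + (x_3 + x_4)\,\partial^\mu(\partial_\sigma A^\sigma). \tag{2.121}
$$

This equation can be put in the form $\partial^\mu(\partial_\sigma A^\sigma) - (\partial^\nu\partial_\nu)A^\mu = 0$ if we choose $x_1 = 0$ and
$x_3 + x_4 = -x_2$.

To establish the equivalence between the preceding expression and Eq. (2.116), we need
to show that they differ from each other by a divergence term. As we saw in Aside 2.17,
a divergence term does not alter the resulting equations of motion. Using the expression
for $F^{\mu\nu}$ as given in Eq. (2.93), we obtain

$$
F_{\mu\nu}F^{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial^\mu A^\nu - \partial^\nu A^\mu) = 2\left(\partial_\mu A_\nu\partial^\mu A^\nu - \partial_\mu A_\nu\partial^\nu A^\mu\right). \tag{2.122}
$$

This version can be recast as a sum of two terms where the first term is equal to Eq. (2.114)
within a constant and the second term is a divergence that does not alter the equations of
motion. The final resulting equation is

$$
F_{\mu\nu}F^{\mu\nu} = 2\,\partial_\mu A_\nu\partial^\mu A^\nu - 2(\partial^\mu A_\mu)^2 - 2\,\partial_\mu\left(A_\nu\partial^\nu A^\mu - A^\mu\partial^\nu A_\nu\right). \tag{2.123}
$$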


The main reason for finding the Lagrangian in (2.116) is to use it for quantizing the
electromagnetic field. As we have seen before, the process of quantization requires
identification of the generalized coordinates and their momenta as well as finding the commutation
relations among these variables. We can use the 4-potential $A^\mu$ for the generalized
coordinates. As before, the generalized momenta should be related to a derivative of $A^\mu$. The
right choice is the derivative of the Lagrangian with respect to $\partial_0A_\mu$, resulting in the
definition $M^\mu = \partial\mathcal{L}/\partial(\partial_0A_\mu)$ for the generalized momenta $M^\mu$.

Noting that the Lagrangian, $\mathcal{L}(A, \partial_\mu A) = \frac{1}{2}\epsilon_0(E^2 - c^2B^2)$, does not contain any terms
containing $\dot\phi$, the generalized momentum of $A^0$ is zero. Owing to this constraint, it is not
possible to elevate all $A^\mu$ to operator status. This can be understood by noting that $A^\mu$ are
not the observable variables in Maxwell’s equations. We can address this issue by adopting
a gauge condition to eliminate any redundancies. The natural choices are (i) the radiation
gauge or (ii) the Lorenz gauge. It is a nontrivial task to prove that these gauge conditions
do not alter any of the physically observable properties of the electromagnetic field. In
practice, it suffices to demonstrate that different gauge choices yield results that differ
from each other at most by a unitary transformation.


2.3.3 Quantization in the Radiation Gauge 


In this subsection we adopt the radiation gauge to quantize the electromagnetic field
[150]. The general strategy we adopt is similar to the process we followed earlier for the
quantum-mechanical formulation of a nanoscale device. In that case, the classical description
employed the generalized coordinates and momenta $(q^\mu, p^\nu)$, and we mapped them to
the corresponding Heisenberg operators $(\hat q^\mu, \hat p^\nu)$ and imposed the commutation relations:

$$
[\hat q^\mu, \hat p^\nu] = i\hbar\,\delta^{\mu\nu}, \qquad [\hat q^\mu, \hat q^\nu] = 0, \qquad [\hat p^\mu, \hat p^\nu] = 0. \tag{2.124}
$$

In the 4-vector notation, the generalized coordinates become $q^\mu = (x^0, x^1, x^2, x^3) = (t, \mathbf{x})$.
The corresponding generalized momenta are $p^\mu = (p^0, \mathbf{p})$, where $p^0 = E$ and $E$ is the
energy of the object being quantized (see Aside 2.15).

The same strategy can be used for quantizing the electromagnetic field. As the
Lagrangian is described using the 4-potential $A^\mu$, we use the mapping: $q^\mu \to A^\mu$ and
$p^\mu \to M^\mu$, where $M^\mu = \partial\mathcal{L}/\partial(\partial_0A_\mu)$. In close analogy, the commutation relations at any
time $t$ are

$$
[A^\mu(t, \mathbf{x}), M^\nu(t, \mathbf{y})] = i\hbar\,\delta^{\mu\nu}\delta^{(3)}(\mathbf{x} - \mathbf{y}), \qquad [A^\mu, A^\nu] = 0, \qquad [M^\mu, M^\nu] = 0, \tag{2.125}
$$

where all operators are evaluated at the same time $t$. Recall that, following the notation in
Aside 2.15, $x^\mu = (t, \mathbf{x})$ and $y^\mu = (t, \mathbf{y})$.

The radiation gauge fixes the redundancy in $A^\mu$ by imposing the gauge conditions: $\phi = 0$
and $\nabla\cdot\mathbf{A} = 0$. As these conditions are clearly not Lorentz covariant, we must explicitly
verify at the end of the quantization process that the outcome of any experiment does not
depend on the inertial reference frame in which it was observed. In the radiation gauge,
$A^0 = 0$. The remaining three components satisfy the equation of motion $\partial^\mu\partial_\mu A^\nu = 0$ for
$\nu = 1, 2, 3$. This equation can be solved by taking the Fourier transform of $A^\nu$ with respect
to the spatial coordinates as

$$
A^\nu(\mathbf{k}) = \int A^\nu(x)\exp(-i\mathbf{k}\cdot\mathbf{x})\,d^3x, \tag{2.126}
$$

where $\mathbf{k} = (k_1, k_2, k_3)$. It is easy to see that $A^\nu(\mathbf{k})$ satisfies $(\partial_0^2 + k^2)A^\nu(\mathbf{k}) = 0$. This is
an ordinary differential equation with solutions of the form $\exp(\pm ikx^0)$ or $\exp(\pm i\omega t)$,
where $\omega = ck$ is the frequency of the electromagnetic wave and $k$ is its propagation
constant. Choosing the negative sign and taking the inverse Fourier transform, we obtain

$$
\mathbf{A}(t, \mathbf{x}) = \frac{1}{(2\pi)^3}\int\mathbf{A}(\mathbf{k})\exp(i\mathbf{k}\cdot\mathbf{x} - i\omega t)\,d^3k, \tag{2.127}
$$

where $\mathbf{A}(\mathbf{k}) = \mathbf{A}^*(-\mathbf{k})$. It is clear that both $\mathbf{k}$ and $-\mathbf{k}$ provide independent solutions to this
equation. Therefore, their superposition represents the general solution of this equation.
Noting that the condition $\nabla\cdot\mathbf{A} = 0$ implies $\mathbf{k}\cdot\mathbf{A}(\mathbf{k}) = 0$, two polarization unit vectors
$\mathbf{e}_\zeta(\mathbf{k})$ are introduced such that $\mathbf{e}_\zeta(\mathbf{k})\cdot\mathbf{k} = 0$, where $\zeta = 1, 2$. Owing to the radiation gauge
condition $\mathbf{k}\cdot\mathbf{A}(\mathbf{k}) = 0$, $\mathbf{A}(\mathbf{k})$ must be a linear combination of $\mathbf{e}_1(\mathbf{k})$ and $\mathbf{e}_2(\mathbf{k})$:

$$
\mathbf{A}(\mathbf{k}) = \mathbf{e}_1(\mathbf{k})\,a_1(\mathbf{k}) + \mathbf{e}_2(\mathbf{k})\,a_2(\mathbf{k}), \tag{2.128}
$$


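where $a_1$ and $a_2$ are the complex amplitudes of this linear combination. Using these results,
the general solution for $\mathbf{A}(t, \mathbf{x})$ can be written as

$$
\mathbf{A}(t, \mathbf{x}) = \frac{1}{(2\pi)^3}\sum_{\zeta=1,2}\int\mathbf{e}_\zeta(\mathbf{k})\left[a_\zeta(\mathbf{k})\,e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)} + a_\zeta^*(\mathbf{k})\,e^{-i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\right]d^3k. \tag{2.129}
$$

To proceed further, we make $\mathbf{A}(t, \mathbf{x})$ a Hermitian operator. This can be achieved by
making each amplitude a Heisenberg operator $\hat a_\zeta(\mathbf{k})$. The conjugate of the amplitude,
$a_\zeta^*(\mathbf{k})$, is mapped to the adjoint Heisenberg operator $\hat a_\zeta^\dagger(\mathbf{k})$. They satisfy the following
commutation relations:

$$
[\hat a_\zeta(\mathbf{k}), \hat a_{\zeta'}^\dagger(\mathbf{k}')] = \delta_{\zeta\zeta'}\,\delta^{(3)}(\mathbf{k} - \mathbf{k}'), \qquad
[\hat a_\zeta(\mathbf{k}), \hat a_{\zeta'}(\mathbf{k}')] = [\hat a_\zeta^\dagger(\mathbf{k}), \hat a_{\zeta'}^\dagger(\mathbf{k}')] = 0. \tag{2.130}
$$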


These commutation relations are central to the quantization of an electromagnetic field
because they enable us to map its classical description to the quantum domain. To see
whether we have succeeded in this mission, we need to find the generalized momenta $M^\mu$
and test for the commutator relationships given in Eq. (2.125).

If we pick $A^0$, the associated momentum $M^0 = 0$ for the Lagrangian given in (2.116). As
$A^0 = 0$ in the radiation gauge, both $A^0$ and $M^0$ can be discarded as dynamic variables. This
is a consequence of reducing the degrees of freedom of the vector potential by fixing it to
the radiation gauge. For the three remaining components $A^i$ ($i = 1, 2, 3$) we find $M^i = -E^i$,
where $E^i$ is the $i$th component of the electric field. After some tedious calculations, it can
be shown that the commutator $[A^i(t, \mathbf{x}), E^j(t, \mathbf{y})] = -i\hbar\,\delta^{ij}\delta^{(3)}(\mathbf{x} - \mathbf{y})$, as expected for the
quantization scheme used here.
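The transversality condition behind Eq. (2.128) can be made concrete by constructing, for a given wave vector, two orthonormal polarization vectors perpendicular to it. The sketch below is one possible construction and not the one used in the text: the choice of a helper axis, the function name `polarization_vectors`, and the sample wave vector and amplitudes are all assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def polarization_vectors(k):
    """Return two orthonormal unit vectors e1, e2 with e1.k = e2.k = 0."""
    k_hat = normalize(k)
    # Use the coordinate axis least aligned with k so the cross product is well conditioned.
    axis_index = min(range(3), key=lambda i: abs(k_hat[i]))
    axis = [1.0 if i == axis_index else 0.0 for i in range(3)]
    e1 = normalize(cross(k_hat, axis))
    e2 = cross(k_hat, e1)          # unit length because k_hat and e1 are orthonormal
    return e1, e2


k = [1.0, 2.0, -2.0]               # sample wave vector (arbitrary units)
e1, e2 = polarization_vectors(k)

# A transverse amplitude A(k) = e1*a1 + e2*a2 with arbitrary complex a1, a2.
a1, a2 = 0.5 + 0.2j, -1.0j
A_k = [e1[i] * a1 + e2[i] * a2 for i in range(3)]
k_dot_A = sum(k[i] * A_k[i] for i in range(3))

print(e1, e2, abs(k_dot_A))
```

Any rotation of the pair `e1`, `e2` about `k` gives an equally valid basis; only the two-dimensional transverse subspace is fixed by the gauge condition.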

We can now construct the state vectors relevant to our quantization scheme. We start
by defining the vacuum state $|0\rangle$ in the Hilbert space (known as the Fock space) as
$\hat a_\zeta(\mathbf{k})|0\rangle = 0$ for all $\mathbf{k}$ and $\zeta = 1, 2$. We can construct the next state $|\mathbf{k}\rangle$ by using the
creation operator as $\hat a_\zeta^\dagger(\mathbf{k})|0\rangle$. The remaining states in the Fock space are generated by
using the same procedure. The associated Hamiltonian can be obtained by normal ordering
of the classical expression for the electromagnetic energy $E_{\rm em} = \frac{1}{2}\int(\mathbf{E}\cdot\mathbf{E} + \mathbf{B}\cdot\mathbf{B})\,d^3x$.
The results are given by


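$$
H = \frac{\hbar}{2}\sum_{\zeta=1,2}\int\omega\left[\hat a_\zeta^\dagger(\mathbf{k})\hat a_\zeta(\mathbf{k}) + \hat a_\zeta(\mathbf{k})\hat a_\zeta^\dagger(\mathbf{k})\right]\frac{d^3k}{(2\pi)^3}. \tag{2.131}
$$

This expression can be simplified using the commutator relations given in Eq. (2.130) to
obtain

$$
H = \sum_{\zeta=1,2}\int\hbar\omega\,\hat a_\zeta^\dagger(\mathbf{k})\hat a_\zeta(\mathbf{k})\,\frac{d^3k}{(2\pi)^3}. \tag{2.132}
$$

The operator $\hat a_\zeta^\dagger(\mathbf{k})\hat a_\zeta(\mathbf{k})$ is the number operator for photons of momentum $\mathbf{p} = \hbar\mathbf{k}$. The
single-particle states containing such photons are generated from the vacuum state through
$\sqrt{2\hbar\omega}\,\hat a_\zeta^\dagger(\mathbf{k})|0\rangle$ and are identified with photons of momentum $\mathbf{p}$ and energy $\hbar\omega$.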
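The step from the symmetrized Hamiltonian to the number-operator form can be illustrated for a single field mode in a truncated number basis. The sketch below is a simplified model under explicit assumptions: one mode only, units with $\hbar\omega = 1$, a cutoff of six number states, and hypothetical helper names. In this discrete setting the commutator equals 1 rather than a delta function, so the symmetrized form differs from the number operator by the constant $1/2$ that normal ordering discards.

```python
import math


def annihilation_matrix(dim):
    """Matrix of a in the number basis |0>, ..., |dim-1>, with <n-1|a|n> = sqrt(n)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def transpose(m):
    """Adjoint of a real matrix."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


DIM = 6                                   # truncation of the Fock space (assumption)
a = annihilation_matrix(DIM)
a_dag = transpose(a)
n_op = matmul(a_dag, a)                   # number operator a^dagger a
a_a_dag = matmul(a, a_dag)

# Diagonal of [a, a^dagger] and of the symmetrized energy (a^dagger a + a a^dagger)/2.
commutator_diag = [a_a_dag[n][n] - n_op[n][n] for n in range(DIM)]
energy_diag = [0.5 * (n_op[n][n] + a_a_dag[n][n]) for n in range(DIM)]

print(commutator_diag)
print(energy_diag)
```

The last diagonal entry of the commutator deviates from 1 purely because of the truncation; it is an artifact of the finite matrix, not a property of the field.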

We discuss briefly what happens when the Lorenz gauge is used in place of the radiation
gauge to quantize the electromagnetic field. Since this gauge is covariant, the scalar
potential does not vanish and $A^0$ has a finite value. However, regardless of the value of $A^0$,
the associated conjugate momentum vanishes for the Lagrangian given in (2.116), and it is
not possible to define the commutator relationship for them. To alleviate this problem, we
modify the Lagrangian (2.116) by adding a divergence term that does not alter the resulting
Euler-Lagrange equation. The modified Lagrangian has the form

$$
\mathcal{L}(A, \partial A) = -\frac{1}{4\mu_0}F_{\mu\nu}F^{\mu\nu} + \frac{\lambda}{2}(\partial_\mu A^\mu)^2. \tag{2.133}
$$

Here the variable $\lambda$ plays the role of a Lagrange multiplier. The simplest choice is to set
$\lambda = 1$. However, if one uses this Lagrangian for quantizing an electromagnetic field,
nonphysical states are found in addition to the physically valid states. A technique known as
Gupta-Bleuler quantization shows how to recover physical states using properties of the
Fock space [150]. With this technique, one can get nonzero conjugate momenta for all four
dynamic variables.


2.4 Second Quantization 


In the preceding section we quantized an electromagnetic wave by using commutator rela- 
tions that were similar to those used for describing the quantum behavior of a particle. 
One distinctive feature of this quantization procedure was that we used the creation and 
annihilation operators, often referred to as the ladder operators, that enable one to build 
the system states starting from the vacuum state. In this section we focus on the single- 
particle Fock states and build many-particle states by filling up each single-particle state 
with a certain number of identical particles. This formalism is known as second quantiza- 
tion [151, 152]. It plays a significant role in any study of many-particle systems that are 
central to modeling quantum devices. In particular, it is the method of choice for indistin- 
guishable particles because the traditional wave function is not suitable for carrying out 
a complex symmetrization procedure needed to describe many-particle states. The second 
quantization method addresses this issue by counting the number of particles that occupy 
each state. Because this process does not refer to any labeling of particles, it contains no 
redundant information and provides a compact description of many-particle states. 


2.4.1 Many-Particle States for Fermions and Bosons 


We begin with single-particle (Fock) states $|n\rangle$, where $n$ is any positive integer. The
single-particle states of several particles are written as $|n\rangle_\zeta$, where $\zeta$ takes integer values 1, 2, ....
They can be used to write the occupation-number representation of a many-particle state
consisting of $n_1$ particles in the state $|n_1\rangle_1$, $n_2$ particles in the state $|n_2\rangle_2$, and so on. Here,
we have adopted a notation where a subscript outside the ket identifies the particle, but
a subscript within the ket identifies its state. It is common to write such a many-particle
state as

$$
|\{n_\zeta\}\rangle = |n_1, n_2, \ldots, n_\zeta, \ldots\rangle, \tag{2.134}
$$

where $N = \sum_\zeta n_\zeta$ is the total number of particles. Owing to the Pauli exclusion principle,
$n_\zeta$ can take only two values (0 or 1) for fermions but it can be any nonnegative integer
for bosons:

$$
n_\zeta =
\begin{cases}
0, 1 & \text{for fermions,}\\
0, 1, 2, 3, \ldots & \text{for bosons.}
\end{cases}
\tag{2.135}
$$
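The difference between the two statistics is easy to see by listing the allowed occupation-number states for a small system. The following sketch enumerates them for a fixed number of particles distributed over a finite number of single-particle states; the function name `occupation_states`, the string labels for the statistics, and the sample sizes are illustrative assumptions.

```python
import math
from itertools import product


def occupation_states(num_modes, num_particles, statistics):
    """List all tuples (n_1, ..., n_M) with sum N allowed by the given statistics."""
    if statistics == "fermion":
        max_occupation = 1                  # Pauli exclusion principle
    elif statistics == "boson":
        max_occupation = num_particles      # any nonnegative integer up to N
    else:
        raise ValueError("statistics must be 'fermion' or 'boson'")
    return [
        occ
        for occ in product(range(max_occupation + 1), repeat=num_modes)
        if sum(occ) == num_particles
    ]


fermion_states = occupation_states(4, 2, "fermion")
boson_states = occupation_states(4, 2, "boson")

print(len(fermion_states), fermion_states)
print(len(boson_states), boson_states)
```

For $N$ particles in $M$ single-particle states the counts follow the familiar binomial formulas, $\binom{M}{N}$ for fermions and $\binom{M+N-1}{N}$ for bosons, which the enumeration reproduces.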


Among this set of many-particle states, some states have special significance. The state
with all occupation numbers equal to zero is called the vacuum state $|0\rangle = |0, 0, \ldots\rangle$. A
state with one non-zero occupation number is denoted as $|n_\zeta\rangle = |0, 0, \ldots, n_\zeta, \ldots\rangle$. The
symmetrized and antisymmetrized multiparticle states are defined using the $S_+$ and $S_-$
operations as

$$
S_\pm|n_1, n_2, \ldots, n_\zeta, \ldots\rangle = \frac{1}{\sqrt{N!}}\sum_{P\in S_N}(\pm 1)^P\,P\,|n_1, n_2, \ldots, n_\zeta, \ldots\rangle, \tag{2.136}
$$

where the summation is taken over all the $N!$ elements in the permutation group $S_N$ and
the permutation operator $P$ has the property

$$
(-1)^P =
\begin{cases}
+1 & \text{for even permutations,}\\
-1 & \text{for odd permutations.}
\end{cases}
\tag{2.137}
$$

Using these definitions, the many-particle states of the bosons and fermions can be
written as

$$
|n_1, n_2, \ldots, n_\zeta, \ldots\rangle_\pm = \frac{1}{\sqrt{n_1!\,n_2!\cdots n_\zeta!\cdots}}\,S_\pm|n_1, n_2, \ldots, n_\zeta, \ldots\rangle, \tag{2.138}
$$

where $\pm$ correspond to bosons and fermions, respectively. Notice that the denominator in
the preceding equation reduces to 1 for fermions because $n_j! = 1$ ($n_j = 0$ or 1 for all $j$).

In practice, dealing with the sum of permutations in Eq. (2.136) is cumbersome and 
inefficient. A better way is to look for an algebraic operation that inherently encodes the 
symmetry or antisymmetry property and frees us from this unwieldy representation. The 
technique offering this possibility uses operators that either lower the number of particles 
in a given state by one (annihilation operator) or increase it by one (creation operator). 
The creation and annihilation operators are different for bosons and fermions because their 
states have different symmetry properties, as described earlier. 

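We define the creation operator $\hat a_\zeta^\dagger$ and the annihilation operator $\hat a_\zeta$ for every $\zeta$ as

$$
\hat a_\zeta^\dagger|n_1, n_2, \ldots, n_\zeta, \ldots\rangle = \sqrt{n_\zeta + 1}\,(\pm 1)^{s_\zeta}\,|n_1, n_2, \ldots, n_\zeta + 1, \ldots\rangle, \tag{2.139}
$$

$$
\hat a_\zeta|n_1, n_2, \ldots, n_\zeta, \ldots\rangle = \sqrt{n_\zeta}\,(\pm 1)^{s_\zeta}\,|n_1, n_2, \ldots, n_\zeta - 1, \ldots\rangle, \tag{2.140}
$$

where $s_\zeta = \sum_{j=1}^{\zeta-1}n_j$. The quantity $s_\zeta$ is also known as the Jordan-Wigner string in the
case of fermions. Its value not only demands a predefined ordering of the single-particle
states but also requires knowing the fermion occupation numbers of all the preceding states.
Owing to this dependency of the local state on the fermion group considered, the creation
and annihilation operators are considered nonlocal for the fermions in some sense. Recall
that the occupation number $n_\zeta$ is either 0 or 1 for fermions, so the fermionic creation
operator gives zero when it acts on a state with $n_\zeta = 1$. It is possible to show that
these operators satisfy the following commutation relations:

$$
[\hat a_j, \hat a_k]_\pm = [\hat a_j^\dagger, \hat a_k^\dagger]_\pm = 0, \qquad [\hat a_j, \hat a_k^\dagger]_\pm = \delta_{jk}, \tag{2.141}
$$

where we have used the definition $[\alpha, \beta]_\pm = \alpha\beta \mp \beta\alpha$. Applying the creation operator
repeatedly to the vacuum state, it is possible to generate every state vector for both bosons
and fermions as follows:

$$
|n_\zeta\rangle = \frac{1}{\sqrt{n_\zeta!}}\,(\hat a_\zeta^\dagger)^{n_\zeta}|0\rangle, \tag{2.142}
$$

$$
|n_1, n_2, \ldots, n_\zeta, \ldots\rangle_\pm = \prod_\zeta\frac{1}{\sqrt{n_\zeta!}}\,(\hat a_\zeta^\dagger)^{n_\zeta}|0\rangle. \tag{2.143}
$$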
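The action of the ladder operators on occupation-number states, including the sign string for fermions, can be written in a few lines. The sketch below is a minimal illustration and not the book's formulation: states are stored as dictionaries mapping occupation tuples to amplitudes, modes are numbered from 0, and the function names `create`, `annihilate`, `add`, and `anticommutation_holds` are hypothetical. The fermionic creation operator is made to return zero on an occupied mode, as the exclusion principle requires.

```python
import math
from itertools import product


def create(mode, state, statistics="fermion"):
    """Apply the creation operator for `mode` to {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        n = occ[mode]
        if statistics == "fermion":
            if n == 1:
                continue                           # Pauli exclusion: result is zero
            sign = (-1) ** sum(occ[:mode])         # Jordan-Wigner string
            factor = 1.0
        else:
            sign, factor = 1, math.sqrt(n + 1)
        new = occ[:mode] + (n + 1,) + occ[mode + 1:]
        out[new] = out.get(new, 0.0) + sign * factor * amp
    return out


def annihilate(mode, state, statistics="fermion"):
    """Apply the annihilation operator for `mode` to {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        n = occ[mode]
        if n == 0:
            continue                               # nothing to remove
        sign = (-1) ** sum(occ[:mode]) if statistics == "fermion" else 1
        new = occ[:mode] + (n - 1,) + occ[mode + 1:]
        out[new] = out.get(new, 0.0) + sign * math.sqrt(n) * amp
    return out


def add(s1, s2):
    """Sum two states and drop components with negligible amplitude."""
    out = dict(s1)
    for occ, amp in s2.items():
        out[occ] = out.get(occ, 0.0) + amp
    return {occ: amp for occ, amp in out.items() if abs(amp) > 1e-12}


def anticommutation_holds(num_modes):
    """Check a_j a_k^dagger + a_k^dagger a_j = delta_jk on every fermionic basis state."""
    for occ in product((0, 1), repeat=num_modes):
        for j in range(num_modes):
            for k in range(num_modes):
                basis = {occ: 1.0}
                result = add(annihilate(j, create(k, basis)),
                             create(k, annihilate(j, basis)))
                expected = {occ: 1.0} if j == k else {}
                if result != expected:
                    return False
    return True


print(anticommutation_holds(3))
print(create(0, create(0, {(0, 0): 1.0}, "boson"), "boson"))
```

Storing only occupation numbers, rather than labeled particles, is what makes the symmetry bookkeeping reduce to a sign factor for fermions and a square-root factor for bosons.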


So far we have assumed that $\zeta$ takes integer values. It is possible to extend the preceding
formulation to the case where $\zeta$ varies continuously. For this purpose, we replace the last
commutation relation in Eq. (2.141) with $[\hat a_{p_1}, \hat a_{p_2}^\dagger]_\pm = \delta(p_1 - p_2)$, where $p_1$ and $p_2$ are
two continuous variables representing each particle’s momentum. The resulting creation
and annihilation operators can be transferred to the physical space using the Fourier theory.
We define new creation and annihilation operators, $\hat A_x^\dagger$ and $\hat A_x$, at a spatial point $x$ using
the relation:

$$
\hat A_x = \int\hat a_p\exp\left(\frac{i}{\hbar}\,p\cdot x\right)dp. \tag{2.144}
$$

One can easily show that the following commutator relations are satisfied by these operators:

$$
[\hat A_{x_1}, \hat A_{x_2}]_\pm = [\hat A_{x_1}^\dagger, \hat A_{x_2}^\dagger]_\pm = 0, \qquad [\hat A_{x_1}, \hat A_{x_2}^\dagger]_\pm = \delta(x_1 - x_2). \tag{2.145}
$$

Aside 2.19 shows an example where such operators are used for describing a quantum
system. Their main advantage is that they can create or destroy a particle at a given location
in the physical space (indicated by $x$).


Aside 2.19 Position and Momentum Representations of a Particle’s Quantum State 


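As we saw earlier in this chapter, a quantum state $|\zeta\rangle$ can be described in many different
bases. If we use the position coordinates (x) for the basis using $\Psi(\mathbf{x}) = \langle \mathbf{x}|\zeta\rangle$, we recover
the wave function appearing in the Schrödinger equation. However, we can also use the
momentum basis and form the function $\Phi(\mathbf{p}) = \langle \mathbf{p}|\zeta\rangle$. Since each representation forms a
complete, orthonormal basis, we can expand $\Phi(\mathbf{p})$ in the position basis as

$$\Phi(\mathbf{p}) = \langle \mathbf{p}|\zeta\rangle = \int \langle \mathbf{p}|\mathbf{x}\rangle \langle \mathbf{x}|\zeta\rangle \, d\mathbf{x} = \int \langle \mathbf{p}|\mathbf{x}\rangle \Psi(\mathbf{x}) \, d\mathbf{x}. \tag{2.146}$$

Thus, we need to find the quantity $\langle \mathbf{p}|\mathbf{x}\rangle$. We know from elementary quantum mechanics
that the following Fourier transform exists:

$$\Phi(\mathbf{p}) = \int \langle \mathbf{x}|\zeta\rangle \exp\left(-\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right) d\mathbf{x}. \tag{2.147}$$

This relation immediately implies that

$$\langle \mathbf{p}|\mathbf{x}\rangle = \exp\left(-\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right). \tag{2.148}$$

In some cases p may take only discrete values. In this case the inverse Fourier relation
becomes

$$\Psi(\mathbf{x}) = \sum_{\mathbf{p}} \Phi(\mathbf{p}) \exp\left(\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right). \tag{2.149}$$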


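Consider a state $|\Psi\rangle$ resulting from the application of the operator $\hat{A}_{\mathbf{x}}^\dagger$ on the vacuum state $|0\rangle$:

$$|\Psi\rangle = \hat{A}_{\mathbf{x}}^\dagger |0\rangle = \sum_{\mathbf{p}} \exp\left(-\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right) \hat{a}_{\mathbf{p}}^\dagger |0\rangle. \tag{2.150}$$

As we discussed earlier, this operation should create a single particle at the point x. It is
instructive to check whether this is indeed the case. We invoke the occupation number
operator (see Aside 2.20), which counts the number of particles in the state $|\Psi\rangle$. As the
operator $\hat{a}_{\mathbf{p}_0}^\dagger \hat{a}_{\mathbf{p}_0}$ counts the number of particles with momentum $\mathbf{p}_0$, we apply it to the
state $|\Psi\rangle$ and sum over all momenta to obtain

$$\sum_{\mathbf{p}_0} \hat{a}_{\mathbf{p}_0}^\dagger \hat{a}_{\mathbf{p}_0} |\Psi\rangle = \sum_{\mathbf{p}_0} \sum_{\mathbf{p}} \exp\left(-\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right) \hat{a}_{\mathbf{p}_0}^\dagger \hat{a}_{\mathbf{p}_0} \hat{a}_{\mathbf{p}}^\dagger |0\rangle. \tag{2.151}$$

However, we know from the commutation relations that $\langle 0|\hat{a}_{\mathbf{p}_0} \hat{a}_{\mathbf{p}}^\dagger|0\rangle = \delta_{\mathbf{p}_0 \mathbf{p}}$. Use of this
result in Eq. (2.151) results in

$$\sum_{\mathbf{p}_0} \hat{a}_{\mathbf{p}_0}^\dagger \hat{a}_{\mathbf{p}_0} |\Psi\rangle = |\Psi\rangle. \tag{2.152}$$

This result shows that the state $|\Psi\rangle$ is an eigenstate of the occupation number operator with
an eigenvalue of 1. As the eigenvalue of the occupation number operator represents the
number of particles in the associated eigenstate, the state $|\Psi\rangle$ represents a single particle.
Suppose this particle was created at some position $\mathbf{x}_0$. We can show this by considering the
overlap $\langle \mathbf{x}_0|\Psi\rangle$:

$$\langle \mathbf{x}_0|\Psi\rangle = \sum_{\mathbf{p}} \exp\left(-\frac{i}{\hbar}\, \mathbf{p} \cdot \mathbf{x}\right) \langle \mathbf{x}_0|\hat{a}_{\mathbf{p}}^\dagger|0\rangle = \sum_{\mathbf{p}} \exp\left[-\frac{i}{\hbar}\, \mathbf{p} \cdot (\mathbf{x} - \mathbf{x}_0)\right]. \tag{2.153}$$

This expression reduces to $\delta(\mathbf{x} - \mathbf{x}_0)$, confirming that the particle was created at the position
$\mathbf{x}_0$, as expected.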
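
The last step of Aside 2.19 can be checked numerically in a discretized setting. The sketch below is only an illustration under stated assumptions: positions are restricted to the sites of a ring of length `L`, the momenta take the `L` values compatible with that ring, and ħ is set to 1. None of these choices comes from the text. In this discrete analogue the sum in Eq. (2.153) becomes `L` times a Kronecker delta instead of a Dirac delta.

```python
import numpy as np

hbar = 1.0                 # natural units (assumption)
L = 8                      # number of lattice sites on a ring (assumption)
sites = np.arange(L)       # allowed positions 0, 1, ..., L-1
momenta = 2.0 * np.pi * hbar * np.arange(L) / L   # momenta compatible with the ring


def overlap(x, x0):
    """Discrete analogue of Eq. (2.153): sum over p of exp[-(i/hbar) p (x - x0)]."""
    return np.sum(np.exp(-1j * momenta * (x - x0) / hbar))


x_created = 3              # site where the particle is created (assumption)
profile = np.array([overlap(x_created, x0) for x0 in sites])

# The overlap vanishes everywhere except at x0 = x_created.
print(np.round(np.abs(profile), 10))
```

The array `profile` plays the role of $\langle \mathbf{x}_0|\Psi\rangle$ sampled on the lattice; its single nonzero entry sits at the creation site.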
2.4.2 Representation of Operators 
It is important to consider how various operators are represented using the creation and
annihilation operators associated with the second-quantization formulation. Suppose $\hat{A}$ is an
operator for N particles that only depends on their coordinates. It can be written as a sum
of individual single-particle operators as

$$\hat{A} = \sum_{k=1}^{N} \hat{A}_k. \tag{2.154}$$

We can represent $\hat{A}$ in a complete basis using the matrix elements $A_{\zeta\eta} = \langle \zeta|\hat{A}|\eta\rangle$ as

$$\hat{A} = \sum_{k=1}^{N} \sum_{\zeta,\eta} |\zeta\rangle_k \, \langle \zeta|\hat{A}|\eta\rangle \, \langle \eta|_k = \sum_{\zeta,\eta} A_{\zeta\eta} \sum_{k=1}^{N} |\zeta\rangle_k \langle \eta|_k. \tag{2.155}$$


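To represent the operator $\hat{A}$ using the annihilation and creation operators, we need to
investigate its action on a general multiparticle state $|n_1, n_2, \ldots, n_\zeta, \ldots\rangle$. In the general
case ζ ≠ η, we obtain

$$\sum_{k=1}^{N} |\zeta\rangle_k \langle \eta|_k \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle = \hat{a}_\zeta^\dagger \hat{a}_\eta \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle. \tag{2.156}$$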
As shown in Aside 2.20, this equation reduces to the particle-number operator of that state.
The same result holds for ζ = η as well. Thus the operator $\hat{A}$ has the following form in the
second-quantization representation:

$$\hat{A} = \sum_{\zeta,\eta} A_{\zeta\eta} \, \hat{a}_\zeta^\dagger \hat{a}_\eta, \tag{2.157}$$

where $\hat{A}$ in $\langle \zeta|\hat{A}|\eta\rangle$ corresponds to single-particle states. Essentially, the operator $\hat{A}$ is a
superposition over all processes that use $\hat{a}_\eta$ to annihilate a single particle in the state $|\eta\rangle$,
scatter it through the matrix element $A_{\zeta\eta}$, and then use $\hat{a}_\zeta^\dagger$ to create that particle in the final
state $|\zeta\rangle$. Thus, the whole process can be viewed as a form of scattering process that brings
particles from the initial state to the final state via all possible single-particle scattering
events.
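
The representation in Eq. (2.157) can be made concrete with small matrices. The following is a minimal sketch for fermions, assuming three single-particle states and using the Jordan-Wigner construction to build the annihilation operators. The construction, the helper names (`fermion_annihilators`, `anticommutator`), and the random Hermitian matrix `A_single` are illustrative choices that do not appear in the text. The sketch builds $\sum A_{\zeta\eta} \hat{a}_\zeta^\dagger \hat{a}_\eta$ and reads the matrix elements back from the one-particle states.

```python
import numpy as np
from functools import reduce


def fermion_annihilators(n_modes):
    """Annihilation matrices on the 2**n_modes-dimensional Fock space.

    Jordan-Wigner construction (an illustrative choice). For every mode,
    basis index 0 means empty and index 1 means occupied.
    """
    identity = np.eye(2)
    parity = np.diag([1.0, -1.0])                # sign string for the earlier modes
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # |0><1| empties one mode
    ops = []
    for j in range(n_modes):
        factors = [parity] * j + [lower] + [identity] * (n_modes - j - 1)
        ops.append(reduce(np.kron, factors))
    return ops


def anticommutator(x, y):
    return x @ y + y @ x


n_modes = 3
a = fermion_annihilators(n_modes)
a_dag = [op.conj().T for op in a]
dim = 2 ** n_modes

# Hypothetical single-particle matrix A_{zeta eta} (random Hermitian).
rng = np.random.default_rng(0)
raw = rng.normal(size=(n_modes, n_modes)) + 1j * rng.normal(size=(n_modes, n_modes))
A_single = (raw + raw.conj().T) / 2

# Second-quantized operator of Eq. (2.157).
A_fock = sum(
    A_single[z, e] * (a_dag[z] @ a[e])
    for z in range(n_modes)
    for e in range(n_modes)
)

# One-particle states a_zeta^dagger |0> and the matrix elements they give back.
vacuum = np.zeros(dim)
vacuum[0] = 1.0
one_particle = [a_dag[z] @ vacuum for z in range(n_modes)]
recovered = np.array(
    [[one_particle[z].conj() @ A_fock @ one_particle[e] for e in range(n_modes)]
     for z in range(n_modes)]
)

# Total particle-number operator, the sum of a_zeta^dagger a_zeta over all states.
N_total = sum(a_dag[z] @ a[z] for z in range(n_modes))
```

Within the one-particle sector, `recovered` coincides with `A_single`, and the diagonal of `N_total` simply counts the occupied modes of each basis state.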


Aside 2.20 Occupation Number Operator 


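The occupation (or particle) number operator $\hat{n}_\zeta$ for each state $|\zeta\rangle$ is defined as [153]

$$\hat{n}_\zeta = \hat{a}_\zeta^\dagger \hat{a}_\zeta, \tag{2.158}$$

such that

$$\hat{n}_\zeta \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle = n_\zeta \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle. \tag{2.159}$$

The total occupation number (or particle number) for the state $|n_1, n_2, \ldots, n_\zeta, \ldots\rangle$ can
be defined as $\hat{N} = \sum_\zeta \hat{n}_\zeta$. It is straightforward to show that

$$\hat{N} \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle = \sum_\zeta n_\zeta \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle = N \, |n_1, n_2, \ldots, n_\zeta, \ldots\rangle. \tag{2.160}$$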


The situation changes if we relax the single-particle association for the operator $\hat{A}$ and
assume that it is associated with two interacting particles. Then the second-quantization
representation of the operator is given by

$$\hat{A} = \frac{1}{2} \sum_{\zeta_1, \zeta_2, \eta_1, \eta_2} A_{\zeta_1 \zeta_2 \eta_1 \eta_2} \, \hat{a}_{\zeta_1}^\dagger \hat{a}_{\zeta_2}^\dagger \hat{a}_{\eta_2} \hat{a}_{\eta_1}, \tag{2.161}$$

where $A_{\zeta_1 \zeta_2 \eta_1 \eta_2} = \langle \zeta_1 \zeta_2|\hat{A}|\eta_1 \eta_2\rangle$ for two-particle states. Very much like the single-particle
operator, this representation embodies the idea that a two-particle operator represents all
processes that annihilate two particles from their initial states, scatter them through the
matrix element $A_{\zeta_1 \zeta_2 \eta_1 \eta_2}$, and create two particles in two new states.


Aside 2.21 Tight-Binding Models 

For materials that are formed from closed-shell atoms or ions, the free-electron model 
is not adequate for describing the motion of electrons. Tight-binding models are simpli- 
fied band models for electrons in solids interacting only with neighboring atoms [154]. 
Even though they look deceptively simple, tight-binding models can be used to calculate 
intricate properties of solids such as surface states or plasmonic response. Unlike the free- 
electron models, the tight-binding models assume that an electron remains mostly bound 
to its own atom, except for an occasional transfer to a neighboring atom. A particularly 
simple example is given by the Hamiltonian 


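$$\hat{H} = -\kappa \sum_{\langle i,j \rangle} \left( \hat{a}_i^\dagger \hat{a}_j + \hat{a}_j^\dagger \hat{a}_i \right), \tag{2.162}$$

where $\hat{a}_i^\dagger$ creates an electron on site i, while $\hat{a}_j$ annihilates an electron on a neighboring
site j. The product $\hat{a}_i^\dagger \hat{a}_j$ describes intuitively the hopping of an electron from the site j to
the site i. A positive value of the hopping parameter (κ > 0) indicates that each hopping
lowers the kinetic energy of the system.

The Hamiltonian in Eq. (2.162) can be diagonalized by expanding $\hat{a}_j$ into a Fourier series
(owing to the periodic nature of the atoms in a solid), resulting in the operators
$\hat{c}_{\mathbf{k}} = \sum_j e^{i \mathbf{k} \cdot \mathbf{r}_j} \hat{a}_j$, where $\mathbf{r}_j$ is the position of site j and k is a vector in the reciprocal space.
The Hamiltonian can then be written in the form $\hat{H} = \sum_{\mathbf{k}} \epsilon_{\mathbf{k}} \, \hat{c}_{\mathbf{k}}^\dagger \hat{c}_{\mathbf{k}}$. In a one-dimensional
lattice, one finds that $\epsilon_k = -2\kappa \cos(kd)$, where d is the lattice spacing.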
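
The one-dimensional dispersion quoted above can be verified by direct diagonalization. The sketch below assumes a ring of `n_sites` atoms with periodic boundary conditions and builds the single-particle hopping matrix whose entries multiply $\hat{a}_i^\dagger \hat{a}_j$ in the Hamiltonian. The values of `kappa`, `d`, and `n_sites` are arbitrary illustrative choices.

```python
import numpy as np

kappa = 1.0     # hopping parameter (assumed value)
d = 1.0         # lattice spacing (assumed value)
n_sites = 12    # number of atoms on the ring (assumed value)

# Single-particle hopping matrix: -kappa between nearest neighbors on a ring.
hopping = np.zeros((n_sites, n_sites))
for i in range(n_sites):
    j = (i + 1) % n_sites
    hopping[i, j] = -kappa
    hopping[j, i] = -kappa

numeric_levels = np.sort(np.linalg.eigvalsh(hopping))

# Allowed wave numbers on the ring and the analytic tight-binding band.
k = 2.0 * np.pi * np.arange(n_sites) / (n_sites * d)
analytic_levels = np.sort(-2.0 * kappa * np.cos(k * d))

print(np.max(np.abs(numeric_levels - analytic_levels)))
```

The eigenvalues of the hopping matrix reproduce $-2\kappa \cos(kd)$ at the wave numbers allowed by the periodic boundary condition.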


Linear Response Theory 


The sciences do not try to explain, they hardly even try to interpret, they mainly make 
models. By a model is meant a mathematical construct which, with the addition of 
certain verbal interpretations, describes observed phenomena. The justification of such 
a mathematical construct is solely and precisely that it is expected to work - that is 
correctly to describe phenomena from a reasonably wide area. Furthermore, it must 
satisfy certain esthetic criteria - that is, in relation to how much it describes, it must be 
rather simple. 

John von Neumann 


3.1 Linear Response of a System

Linear systems have been studied in diverse disciplines including physics, mathematics,
and engineering [155, 156, 157]. In electrical engineering, when electronic probing is used 
to monitor a system, its interaction with the probe is considered a small perturbation to 
the system; if it were not, we would not be probing the system, but the system would 
be modified by the probe [158]! Consequently, the results of probing can be expressed in 
terms of a linear response function that depends on the properties of the monitored system 
(but not on the probe). Owing to its fundamental importance, there is no doubt that the 
linear response of systems will continue to play an important role for as long as one can 
foresee. 

Linear response theory describes mathematically changes induced in the properties of a 
system as a result of external probing. The primary aim is to model the system’s response 
without considering details of the probe—system interaction. One surprising but highly use- 
ful result is that the linear response function of the system is predominantly determined by 
the eigenvalues and eigenfunctions of the unperturbed system [159]. As a consequence, it is 
possible to determine the eigenvalues (excitation energies) of a system from its frequency 
response to external probing. It should be stressed that many systems behave nonlinearly, 
and their response to an external probe is not always linear. However, by treating the exter- 
nal stimulus as “perturbative” (i.e., relatively small), linear response theory can be used 
for nonlinear systems as well. If this assumption appears to be too restrictive, one may 
resort to the Lagrangian formalism and solve the equations of motion as described in 
Section 2.1. However, owing to the computational complexity of such an approach, we are
likely to miss valuable physical insights into the system’s behavior. Linear response theory
makes a compromise by making reasonable predictions from general physical principles,
albeit with some loss of accuracy [160].


3.1.1 General Formalism and Impulse Response 


The response of a linear system to an input signal v(t) is governed by [161]

$$y(t) = \int_{-\infty}^{\infty} g(t, \tau) \, v(\tau) \, d\tau, \tag{3.1}$$

where y(t) is the output signal. The function g(t, τ) is called the impulse response because
it is the output one receives if the system is excited by an impulse. Its two arguments denote
the time τ at which an impulse was applied and the time t at which the system’s output is
observed.

The infinite integration limits in Eq. (3.1) can be removed by considering the causality
requirement. Suppose that the input was applied at $t = t_0$. The system is called relaxed at
time $t_0$ if it has a null response before this instant. For a relaxed system, we can replace
the lower limit of integration by $t_0$. If the system is causal, then the output cannot occur
before the input is applied. This can only be assured if g(t, τ) = 0 when τ > t. As a result,
the upper limit of the integral in Eq. (3.1) can be replaced with t. Thus, the response of a
relaxed, causal, linear system can be written as [155, 161]:

$$y(t) = \int_{t_0}^{t} g(t, \tau) \, v(\tau) \, d\tau. \tag{3.2}$$


Many linear systems are also time invariant in the sense that their impulse response does
not change if all times are shifted by a constant amount, say $t_c$. This is possible only if
$g(t + t_c, \tau + t_c) = g(t, \tau)$. Choosing $t_c = -\tau$ gives us $g(t, \tau) = g(t - \tau, 0)$. Replacing
$g(t - \tau, 0)$ with $g(t - \tau)$ for a time-invariant system, its linear response takes the form
[161]:

$$y(t) = \int_{t_0}^{t} g(t - \tau) \, v(\tau) \, d\tau. \tag{3.3}$$


As $t_0$ is an arbitrary reference time, $t_0 = 0$ is widely used, especially in engineering. Without
loss of generality, we adopt this convention here. To simplify the following discussion,
we also assume that the input v(t) and the output y(t) are scalar functions of time. This
assumption is not restrictive, and the conclusions reached here remain valid for the vector
functions as well. With these simplifications, the response of a linear time-invariant (LTI)
system is governed by

$$y(t) = \int_{0}^{t} g(t - \tau) \, v(\tau) \, d\tau \;\longrightarrow\; y(t) = g(t) * v(t), \tag{3.4}$$

where the compact form of y(t) makes use of the convolution operator * that is employed
commonly to describe the input-output relationship for linear systems.
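
Equation (3.4) can be evaluated numerically by replacing the integral with a discrete sum. The sketch below is a minimal illustration under assumed values: the impulse response is taken to be that of a first-order low-pass system, $g(t) = \exp(-t/\tau_{rc})/\tau_{rc}$, and the input is a unit step applied at t = 0. Both choices, the time constant `tau_rc`, and the step size `dt` are assumptions made here for illustration. For this pair the exact output is $1 - \exp(-t/\tau_{rc})$.

```python
import numpy as np

tau_rc = 0.5                      # time constant of the assumed system
dt = 1e-3                         # sampling step of the Riemann sum
t = np.arange(0.0, 5.0, dt)

g = np.exp(-t / tau_rc) / tau_rc  # impulse response g(t), zero for t < 0
v = np.ones_like(t)               # unit step input applied at t = 0

# Discrete form of Eq. (3.4): y[n] = dt * sum_{m=0..n} g[n - m] v[m].
y_numeric = np.convolve(g, v)[: t.size] * dt
y_exact = 1.0 - np.exp(-t / tau_rc)

max_error = np.max(np.abs(y_numeric - y_exact))
print(max_error)
```

The discrepancy `max_error` is set by the rectangle-rule discretization and shrinks as `dt` is reduced.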

Figure 3.1 Relationship for a linear time-invariant system between the time-domain functions and their Laplace transforms in
the s domain or Fourier transforms in the ω domain. (The diagram links the time-domain chain v(t), g(t), g(t) * v(t) to the
Laplace-domain chain V(s), G(s), G(s)V(s) and to the corresponding Fourier-domain quantities.)

It is well known that the convolution operation becomes a product if one takes the
Fourier transform of all functions appearing in the convolution. The same property also
holds for the Laplace transforms. As the lower limit is not infinite in Eq. (3.4), it is often
necessary to employ the Laplace transform of the input function defined as

$$\mathcal{L}\{v(t)\}(s) = V(s) = \int_{0}^{\infty} v(t) \exp(-st) \, dt, \tag{3.5}$$

where s is a complex number and V(s) denotes the Laplace transform of v(t) in the s
domain. By using the convolution property of the Laplace transform, Eq. (3.4) in the s
domain takes the form

$$\mathcal{L}\{y(t)\} = \mathcal{L}\{g(t)\} \, \mathcal{L}\{v(t)\} \;\longrightarrow\; Y(s) = G(s)V(s). \tag{3.6}$$

In the case of Fourier transforms, s becomes purely imaginary (s = iω), and the corresponding
relation is written as Y(ω) = G(ω)V(ω). Figure 3.1 shows both of these relations
in a graphic form.
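
The product rule Y(ω) = G(ω)V(ω) has an exact discrete counterpart that is easy to test. The sketch below uses two arbitrary random sequences (an assumption made purely for illustration) and NumPy's discrete Fourier transform. Note that `numpy.fft.fft` uses the kernel exp(−iωt), which may differ in sign from the convention adopted in the text; the product property holds for either sign. Zero-padding both sequences to the length of the linear convolution is what makes the circular convolution implied by the DFT agree with the linear one.

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.normal(size=64)        # stand-in for a sampled impulse response
v = rng.normal(size=64)        # stand-in for a sampled input signal

y = np.convolve(g, v)          # linear convolution, length 64 + 64 - 1
n = y.size

Y = np.fft.fft(y)
G = np.fft.fft(g, n)           # zero-padded to length n
V = np.fft.fft(v, n)           # zero-padded to length n

print(np.max(np.abs(Y - G * V)))
```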

Laplace domain results can be converted to the time domain using the inverse Laplace
transform, also known as the Mellin inverse formula [162]:

$$v(t) = \mathcal{L}^{-1}\{V(s)\} = \lim_{W \to \infty} \frac{1}{2\pi i} \int_{\gamma - iW}^{\gamma + iW} V(s) \exp(st) \, ds, \tag{3.7}$$

where the integration is done along a vertical line located at Re(s) = γ in the complex
plane such that γ is greater than the real part of all singularities of V(s). This ensures
that the contour path lies in the region of convergence of V(s). If all singularities of V(s)
happen to lie in the left half-plane, γ can be set to zero, and the preceding integral becomes
identical to the inverse Fourier transform. In practice, this integral is difficult to evaluate
when γ ≠ 0. Sometimes one can bypass the integral by decomposing V(s) into a sum
of known transforms and construct the inverse by inspection. However, this “inspection”
method is applicable only in a few limited cases.
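
As a brief worked illustration of the inspection method, consider the hypothetical transform $V(s) = 1/[(s+1)(s+2)]$, chosen here only as an example. It relies on the standard pair $\mathcal{L}\{e^{-at}\} = 1/(s+a)$:

```latex
V(s) = \frac{1}{(s+1)(s+2)} = \frac{1}{s+1} - \frac{1}{s+2}
\quad\Longrightarrow\quad
v(t) = \mathcal{L}^{-1}\{V(s)\} = e^{-t} - e^{-2t}, \qquad t \ge 0 .
```

Both singularities (s = −1 and s = −2) lie in the left half-plane, so γ = 0 is an admissible choice in Eq. (3.7) for this example.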


3.1.2 Equilibrium Ensembles in Classical Statistical Mechanics 


Table 3.1

| Ensemble (fixed quantities) | Fundamental parameter | Total differential |
|---|---|---|
| microcanonical (N, V, E) | $S = k_B \ln(N_{\mathrm{states}})$ | $dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV - \frac{\mu}{T}\,dN$ |
| canonical (N, V, T) | $F = -k_B T \ln(Z_{NVT})$ | $dF = -S\,dT - p\,dV + \mu\,dN$ |
| grand-canonical (μ, V, T) | $pV = k_B T \ln(Z_{\mu VT})$ | $d(pV) = S\,dT + p\,dV + N\,d\mu$ |

As the linear system is assumed to be in thermal equilibrium before the external perturbation
is applied, we discuss the three equilibrium ensembles used commonly in statistical
mechanics. The microstates of any macroscopic system represent all possible states that
it may take. An ensemble is an idealization consisting of virtual copies of all microstates 
of a macroscopic system, subject to the constraints imposed on the system [159]. Such 
an ensemble is used to calculate the probability distribution function, whose knowledge 
enables one to compute the average properties of the system. Ensembles considered in sta- 
tistical thermodynamics compute these probabilities by applying the laws of classical or 
quantum mechanics that govern the system’s evolution. There are three different ways one
can formulate a system’s equilibrium; these are known as microcanonical, canonical, and
grand-canonical ensembles. For a macroscopic system in equilibrium, all three ensembles 
give the same final result. As such, choice of the ensemble is dictated by the nature of the 
physical system under consideration and the properties one is interested in. 

Microcanonical ensemble: A microcanonical ensemble is based on the system’s total
energy, governed by the Hamiltonian H(q, p), in the range [E, E + ΔE] with a fixed number
N of particles inside a fixed volume V. The system stays in equilibrium by not exchanging
energy or particles with its environment. This ensemble assigns equal probability density,
$\rho_{eq}(q, p)$, to every microstate in the energy range [E, E + ΔE]. All other microstates are
given a probability of zero,

$$\rho_{eq}(q, p) = \begin{cases} 1/\Omega, & H(q, p) \in [E, E + \Delta E] \\ 0, & \text{otherwise,} \end{cases} \tag{3.8}$$

where we have defined Ω as the phase-space volume of the region in which
H(q, p) ∈ [E, E + ΔE]:

$$\Omega(E, E + \Delta E) = \int_{H(q,p) \in [E, E + \Delta E]} dq \, dp. \tag{3.9}$$

Using the density of states in the form $\mathrm{Dos}(E) = \int \delta[E - H(q, p)] \, dq \, dp$ (see Section 1.3.2),
the probability density can be written as $\rho_{eq}(q, p) = \delta[E - H(q, p)]/\mathrm{Dos}(E)$.

The fundamental parameter that relates the microcanonical ensemble to thermodynamics
is the entropy (see Table 3.1). Based on information theory, entropy is a measure of
uncertainty (or lack of information). If a system has many equally probable outcomes,
say $N_{\mathrm{states}}$, then the entropy of the system is defined as $S = k_B \ln(N_{\mathrm{states}})$ [163]. This
definition can be readily ported to the microcanonical ensemble by counting the number
of microscopic states in the energy range [E, E + ΔE]. This number is given by
$N_{\mathrm{states}} = \Omega(E, E + \Delta E)/(N! \, h^{3N})$, where h is the Planck constant. Here we used the uncertainty
principle (see Aside 2.11) of quantum mechanics stating that an infinitesimal volume
dq dp can only be located with an uncertainty of $h^{3N}$. We also used the fact that N particles
can be arranged in N! ways.

Canonical ensemble: A canonical ensemble is in equilibrium with its environment at a
constant temperature T and has a fixed number of particles (N) inside a constant volume V
[159]. This ensemble does not pose any constraints on the system’s total energy E, which
can be exchanged with the environment. The probability density for a system with the
Hamiltonian H(q, p) is given by

$$\rho_{eq}(q, p) = \frac{\exp[-H(q, p)/k_B T]}{\int \exp[-H(q, p)/k_B T] \, dq \, dp}. \tag{3.10}$$

The partition function $Z_{NVT}$ for this ensemble is defined as

$$Z_{NVT} = \int \frac{1}{N! \, h^{3N}} \exp\left(-\frac{H(q, p)}{k_B T}\right) dq \, dp. \tag{3.11}$$

As seen in Table 3.1, this quantity relates to the Helmholtz free energy through the relation
$F = -k_B T \ln(Z_{NVT})$ and provides a way to link thermodynamic variables to the equilibrium
distribution.
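
Equations (3.10) and (3.11) can be evaluated on a grid for a simple system. The sketch below assumes a single particle in a one-dimensional harmonic potential, so that N = 1 and the factor $N!\,h^{3N}$ is replaced by a single power of h for the one degree of freedom. The Hamiltonian, the arbitrary units, and all parameter values are assumptions introduced only for illustration. For this case the integrals are Gaussian and give $Z = 2\pi k_B T/(h\omega)$, while the mean energy equals $k_B T$ by equipartition.

```python
import numpy as np

# Illustrative parameters in arbitrary units (all values are assumptions).
k_B_T = 0.7      # thermal energy k_B * T
mass = 1.0
omega = 2.0      # oscillator angular frequency
h = 1.0          # Planck constant in these units

q = np.linspace(-8.0, 8.0, 801)
p = np.linspace(-8.0, 8.0, 801)
dq = q[1] - q[0]
dp = p[1] - p[0]
Q, P = np.meshgrid(q, p, indexing="ij")

hamiltonian = P**2 / (2.0 * mass) + 0.5 * mass * omega**2 * Q**2
boltzmann = np.exp(-hamiltonian / k_B_T)

# Eq. (3.11) with one particle and one degree of freedom.
Z_numeric = boltzmann.sum() * dq * dp / h
Z_exact = 2.0 * np.pi * k_B_T / (h * omega)

# Eq. (3.10): normalized probability density sampled on the grid.
rho_eq = boltzmann / (boltzmann.sum() * dq * dp)
mean_energy = (rho_eq * hamiltonian).sum() * dq * dp

# Free energy from the partition function.
free_energy = -k_B_T * np.log(Z_numeric)
print(Z_numeric, Z_exact, mean_energy, free_energy)
```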

Grand-canonical ensemble: A grand-canonical ensemble can exchange both energy and
particles with its environment. It has a constant volume V at a constant temperature T and
a constant chemical potential μ [159]. The probability density for a system containing N
particles with the Hamiltonian H(q, p) is given by

$$\rho_{eq}(q, p) = \frac{\exp\left(-[H(q, p) - \mu N]/k_B T\right)}{\sum_{N=0}^{\infty} \int \exp\left(-[H(q, p) - \mu N]/k_B T\right) dq \, dp}. \tag{3.12}$$

The partition function $Z_{\mu VT}$ for this ensemble can be written using the partition function
$Z_{NVT}$ of the canonical ensemble as

$$Z_{\mu VT} = \sum_{N=0}^{\infty} \exp(\mu N/k_B T) \, Z_{NVT}. \tag{3.13}$$

As seen in Table 3.1, this quantity relates to the system pressure p through the relation
$pV = k_B T \ln(Z_{\mu VT})$, and provides a way to link thermodynamic variables to the equilibrium
distribution.
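
The sum in Eq. (3.13) can be carried out explicitly for a simple model. The sketch below assumes an ideal gas of noninteracting particles, for which the canonical partition function factorizes as $Z_{NVT} = z_1^N/N!$ with $z_1$ the single-particle partition function. This factorized form, the value of `z1`, and the other parameters are assumptions made for illustration only. The series then sums to $\exp(e^{\mu/k_B T} z_1)$, and the mean particle number follows from the same weights.

```python
import math

k_B_T = 1.0      # thermal energy (arbitrary units, assumed)
mu = -1.0        # chemical potential (assumed)
z1 = 5.0         # single-particle partition function (assumed)

fugacity = math.exp(mu / k_B_T)

# Terms of Eq. (3.13) with Z_NVT = z1**N / N! (ideal-gas assumption).
terms = [fugacity**N * z1**N / math.factorial(N) for N in range(60)]

Z_grand_numeric = sum(terms)
Z_grand_exact = math.exp(fugacity * z1)

# Mean particle number from the grand-canonical weights.
mean_N = sum(N * term for N, term in enumerate(terms)) / Z_grand_numeric
print(Z_grand_numeric, Z_grand_exact, mean_N)
```

For this ideal-gas case the logarithm of the grand partition function equals the mean particle number, `fugacity * z1`.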


3.1.3 Equilibrium Ensembles in Quantum Statistical Mechanics 


The linear response theory applicable to quantum systems considers near-equilibrium fluc- 
tuations, which are ensemble-averaged small changes induced by an external probe. The 
underlying assumption is that the macroscopic properties of a linear system are the results 
of ensemble-averaged microscopic properties. When a system is interacting with its sur- 
rounding environment, both its energy and the number of particles can fluctuate (because 
particles can either leave or enter from the environment). In either case, the system is con- 
sidered to be in equilibrium with its environment if its temperature and chemical potential 
(for each kind of particle) remain constant. Enforcing reasonable constraints, it is possible 
to adapt the classical equilibrium distributions to a quantum system.

We have seen in Section 2.2 that one can associate with any physical quantity O an
observable $\hat{O}$, which is a Hermitian operator in the Hilbert space of the quantum device
considered. When we carry out measurements, the only possible values for the physical
quantity O are the eigenvalues $\lambda_O$ of this Hermitian operator $\hat{O}$. This implies that, unlike in
classical physics, the value of a physical quantity is not uniquely determined for a particular
microstate. Rather, the probabilistic nature of the events needs to be taken into account
using the Born interpretation of the state vector, as discussed in Section 2.2. A state vector
does not describe the properties of one single device. Rather, it describes the properties
of a statistical ensemble of such devices prepared under the same conditions. Knowledge
of the state vector $|\Psi\rangle$ enables us to determine the expectation value of the observable as
$\langle O \rangle = \langle \Psi|\hat{O}|\Psi\rangle$.

To extend the classical equilibrium concept to quantum devices, we define a quantum
macrostate as an ensemble of microstates. Then, if the density operator ρ of the ensemble
is known, the expectation value of O is calculated using

$$\langle O \rangle = \mathrm{Tr}(\rho \hat{O}). \tag{3.14}$$

Because the trace of an operator is independent of the basis used, $\mathrm{Tr}(\rho \hat{O})$ does not depend
on the basis of the eigenvectors representing the underlying states. Here, the density operator
ρ corresponds to the probability distribution in classical statistical physics. Thus, it is
important to realize that ρ performs two different roles simultaneously. First, it provides
the quantum average on the state; second, it performs the statistical average on the state
vectors of the environment.

When we consider the density operator of the equilibrium distribution, denoted by $\rho_{eq}$,
the corresponding ensemble must be stationary (i.e., $d\rho_{eq}/dt = 0$). It follows from the
Liouville equation,

$$\frac{\partial \rho_{eq}}{\partial t} = \frac{1}{i\hbar} [H, \rho_{eq}], \tag{3.15}$$

that $\rho_{eq}$ commutes with the Hamiltonian of the ensemble that itself does not depend on time
because of the equilibrium nature of the ensemble. This feature can be used to calculate
the expectation value of O in the Heisenberg picture by using

$$\langle O(t) \rangle_{eq} = \mathrm{Tr}[\rho_{eq} \hat{O}(t)], \qquad \hat{O}(t) = U^\dagger(t) \, \hat{O} \, U(t), \tag{3.16}$$

where $U(t) = \exp(-iHt/\hbar)$. As the trace operation is invariant under a cyclic permutation,
it follows that

$$\langle O(t) \rangle_{eq} = \mathrm{Tr}[\rho_{eq} U^\dagger(t) \, \hat{O} \, U(t)] = \mathrm{Tr}(\rho_{eq} \hat{O}) = \langle O \rangle_{eq}. \tag{3.17}$$

This result is expected because, even though the operator is time-dependent, its ensemble
average over an equilibrium distribution should be a constant.
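
The time independence expressed by Eq. (3.17) is easy to confirm with small matrices. The sketch below assumes a randomly generated four-level Hamiltonian and observable, a canonical density operator $\rho_{eq} = \exp(-H/k_B T)$ normalized by its trace, and units with ħ = 1. All of these are illustrative assumptions. The matrix exponentials are formed from the eigen-decomposition of H.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4          # number of levels (assumed)
hbar = 1.0       # natural units (assumed)
k_B_T = 0.5      # thermal energy (assumed)


def random_hermitian(n):
    raw = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (raw + raw.conj().T) / 2


H = random_hermitian(dim)     # hypothetical Hamiltonian
O = random_hermitian(dim)     # hypothetical observable

energies, vecs = np.linalg.eigh(H)
rho_eq = vecs @ np.diag(np.exp(-energies / k_B_T)) @ vecs.conj().T


def evolution_operator(t):
    """U(t) = exp(-i H t / hbar), built from the eigen-decomposition of H."""
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T


def equilibrium_average(t):
    U = evolution_operator(t)
    O_t = U.conj().T @ O @ U                       # Heisenberg-picture operator
    return np.trace(rho_eq @ O_t) / np.trace(rho_eq)


averages = np.array([equilibrium_average(t) for t in np.linspace(0.0, 10.0, 25)])
print(np.max(np.abs(averages - averages[0])))
```

Because `rho_eq` commutes with `H`, every entry of `averages` agrees with the value at t = 0 to numerical precision.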

To apply these considerations to the microcanonical ensemble, we assume that the eigenvalues
of the Hamiltonian H of a quantum device are given by the set $\{E_n\}$ with the
eigenfunctions $|n\rangle$ (i.e., $H|n\rangle = E_n|n\rangle$). When we want to identify the number N of particles
in the system, we modify the notation by incorporating N in the eigenvalue equation
as $H|n\rangle = {}_N E_n |n\rangle$ (i.e., ${}_N E_n$ is the nth energy level of a quantum device with N particles).
As the microcanonical ensemble has a constant number of particles inside a fixed volume
V with a constant energy E, the matrix elements of its equilibrium density matrix can be
calculated using

$$(\rho_{eq})_{nm} = \langle n|\rho_{eq}|m\rangle = \langle n|\delta(E - H)|m\rangle = \delta(E - {}_N E_n) \, \delta_{n,m}. \tag{3.18}$$

Owing to this result, the associated partition function, $Z_{NVE} = \mathrm{Tr}(\rho_{eq}) = \sum_n \delta(E - {}_N E_n)$,
is equal to the density of states Dos(E).

The canonical ensemble has a fixed number N of particles inside a constant volume V at
a constant temperature T. Correspondingly, the matrix elements of the equilibrium density
matrix are calculated as

$$(\rho_{eq})_{nm} = \langle n|\exp[-H/(k_B T)]|m\rangle = \exp[-{}_N E_n/(k_B T)] \, \delta_{nm}. \tag{3.19}$$

The partition function in this case is given by $Z_{NVT} = \mathrm{Tr}(\rho_{eq}) = \sum_n \exp(-{}_N E_n/k_B T)$.

Finally, the grand-canonical ensemble has a fixed volume V at a constant temperature T,
and a constant chemical potential μ. The matrix elements of the equilibrium density matrix
for N particles are calculated as

$$({}_N\rho_{eq})_{nm} = \langle n|\exp[-(H - \mu N)/k_B T]|m\rangle = \exp[-({}_N E_n - \mu N)/k_B T] \, \delta_{nm}. \tag{3.20}$$

The associated partition function for this case is given by

$$Z_{\mu VT} = \sum_{N} \mathrm{Tr}({}_N\rho_{eq}) = \sum_{N,n} \exp[-({}_N E_n - \mu N)/k_B T]. \tag{3.21}$$

Aside 3.1 shows how these concepts can be used to derive the Bose-Einstein and Fermi-Dirac
distributions.


Aside 3.1 Fermi—Dirac and Bose-Einstein Distributions 

Indistinguishable particles play a central role in the operation of quantum devices [164]. 
As discussed in Section 2.4, such particles are called bosons when their spin is zero or
an integer (in multiples of ħ), and their statistics are described by the Bose-Einstein
distribution function. Examples of bosons include photons, phonons, and Higgs particles. In
contrast, if the particles have half-integer spins, they are called fermions, and their statis- 
tics are governed by the Fermi—Dirac distribution function. Examples of fermions include 
electrons, muons, and protons. Both types of distributions play important roles in practice. 
For example, the behavior of electrons in metals and semiconductors at a finite temperature 
depends on the Fermi—Dirac distribution. Similarly, the properties of lasers, spasers, and 
Bose-Einstein condensates are governed by the Bose-Einstein distribution [164, 165]. 


A crucial question that we need to answer is: which equilibrium distribution must we adopt 
for bosons and fermions? As all three equilibrium ensembles require that we know the 
exact number of states in each of the allowed energies, counting states becomes an essential 
consideration. The canonical and the microcanonical ensembles require us to sum the states 
over a fixed number of particles. In practice, counting of states is much more complicated 
for a system with a fixed number of particles than for a system with a fixed chemical
potential. This consideration suggests the grand-canonical distribution as a possible choice.
However, to apply the grand-canonical ensemble, we need to know the chemical potential.


Application of the grand-canonical ensemble becomes much easier in situations where the 
particles can be treated as being independent with negligible interactions. In this situation, 
the total energy of the system can be written as a sum over energies of the single-particle 
states. This is the case for a collection of strongly degenerate fermions or bosons. In the 
case of fermions, strongly degenerate means that the lowest single-particle states are occu- 
pied with a probability close to unity. In the case of bosons, strongly degenerate means that 
a significant fraction of the particles are in the lowest energy state. 


Fermi-Dirac distribution: Consider a fermion with the eigenstate $|k\rangle$ and energy $E_k$.
Owing to the Pauli exclusion principle, the occupation number $n_k$ for this state can
only have values 0 or 1. By adopting the grand-canonical ensemble as the equilibrium
distribution, the average occupation number $\langle n_k \rangle$ is found to be [164, 165]:

$$\langle n_k \rangle = \frac{\mathrm{Tr}(\rho_{eq} \hat{n}_k)}{\mathrm{Tr}(\rho_{eq})} = \frac{\sum_{n_k=0}^{1} n_k \exp[-(n_k E_k - \mu n_k)/k_B T]}{\sum_{n_k=0}^{1} \exp[-(n_k E_k - \mu n_k)/k_B T]} = \frac{1}{\exp[(E_k - \mu)/k_B T] + 1}. \tag{3.22}$$

This is known as the Fermi-Dirac distribution and represents the probability that energy
level $E_k$ is occupied by a fermion. Even though a distribution function describes the number
of particles occupying a given state, the Fermi-Dirac distribution can be interpreted as a
probability distribution because at most one fermion is allowed in each state. In practice,
the chemical potential μ is replaced by the Fermi energy $E_F$.
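
The middle step of Eq. (3.22) can be evaluated directly. The sketch below computes the occupation as a weighted sum over the allowed values $n_k \in \{0, 1\}$ and compares it with the closed form. The dimensionless variable `x` stands for $(E_k - \mu)/k_B T$, and the helper name `occupation_from_sum` is an illustrative choice.

```python
import numpy as np


def occupation_from_sum(x, allowed_n):
    """Average occupation from grand-canonical weights exp(-n x),
    where x = (E_k - mu) / (k_B T)."""
    n = np.asarray(allowed_n, dtype=float)
    weights = np.exp(-n * x)
    return np.sum(n * weights) / np.sum(weights)


x_grid = np.linspace(-5.0, 5.0, 11)

# Fermions: Pauli exclusion restricts n_k to 0 or 1.
fd_from_sum = np.array([occupation_from_sum(x, [0, 1]) for x in x_grid])
fd_closed_form = 1.0 / (np.exp(x_grid) + 1.0)

print(np.max(np.abs(fd_from_sum - fd_closed_form)))
```

At `x = 0`, that is, for a level located exactly at the chemical potential, both expressions give an occupation of one half.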


Figure 3.2 shows the Fermi-Dirac distribution as a function of the energy ratio
$(E_k - \mu)/k_B T$. The classical Maxwell-Boltzmann distribution in the form $\exp[-(E_k - \mu)/k_B T]$
is also shown for comparison. A useful observation is that for states with energy slightly
larger than the Fermi level $E_F = \mu$ (by a few $k_B T$), we can approximate the Fermi-Dirac
distribution with the classical distribution. This approximation has proved quite useful for
describing the behavior of electrons and holes inside semiconductors and the distribution
of optical phonons in solids at low temperatures. Given that $\langle n_k \rangle$ is a function of energy $E_k$,
we can calculate the number of fermions in the energy range $[E_i, E_f]$ by using the density
of states as

$$N_F = \int_{E_i}^{E_f} \mathrm{Dos}(E_k) \, \langle n_k \rangle \, dE_k. \tag{3.23}$$


Bose-Einstein distribution: Consider a boson with the eigenstate $|k\rangle$ and energy $E_k$.
Unlike the fermions, the occupation number $n_k$ of bosons can take any integer value from
0 to ∞. By adopting the grand-canonical ensemble as the equilibrium distribution, we can
calculate the average occupation number as [164, 165]

$$\langle n_k \rangle = \frac{\mathrm{Tr}(\rho_{eq} \hat{n}_k)}{\mathrm{Tr}(\rho_{eq})} = \frac{\sum_{n_k=0}^{\infty} n_k \exp[-(n_k E_k - \mu n_k)/k_B T]}{\sum_{n_k=0}^{\infty} \exp[-(n_k E_k - \mu n_k)/k_B T]} = \frac{1}{\exp[(E_k - \mu)/k_B T] - 1}. \tag{3.24}$$

This function is called the Bose-Einstein distribution; it is also plotted in Figure 3.2 using
μ = 0. It was first derived by Bose in 1924 for photons and generalized later by Einstein
for atoms. Just like the Fermi-Dirac distribution, we can approximate the Bose-Einstein
distribution with the Maxwell-Boltzmann distribution when $(E_k - \mu) \gg k_B T$. A celebrated
application of the Bose-Einstein distribution is for blackbody radiation, which can
be treated as an ideal gas of photons with a variable number of particles. The condition of
energy minimum gives μ = 0, and the energy of each photon is hν for the radiation of
frequency ν. The density of states for photons of frequency ν from Section 1.5 is given as
$\mathrm{Dos}(\nu) = 8\pi\nu^2/c^3$. Multiplying this density of states with the average energy of photons
in each state ($\langle n_k \rangle h\nu$) yields Planck’s blackbody radiation formula:

$$\rho(\nu) = \frac{8\pi h \nu^3}{c^3} \, \frac{1}{\exp(h\nu/k_B T) - 1}. \tag{3.25}$$

Figure 3.2 Average population distribution of single-particle states for bosons (Bose-Einstein) and fermions (Fermi-Dirac)
under the grand-canonical ensemble, plotted against $(E - \mu)/k_B T$. Both distributions approach the classical distribution
(Maxwell-Boltzmann) when $(E_k - \mu)/k_B T \gg 1$.

3.2 Linear Response Function 


In this section we discuss the linear response theory as formulated by Kubo [166], Mazo 
[167], and others [168]. We consider a system in thermal equilibrium and calculate the 
system’s response to an external stimulus applied at time t = 0 using the grand-canonical 
ensemble of Section 3.1.3. This choice enables us to access many results known in the 
disciplines of systems engineering and control engineering.

Table 3.2 Some examples of perturbations to the equilibrium Hamiltonian.

| Perturbation | F(t) | B |
|---|---|---|
| electric field coupling | electric field, E(t) | electric dipole moment, d |
| magnetic field coupling | magnetic field, B(t) | magnetic dipole moment, μ |
| electro-optic coupling | external voltage, V(t) | permittivity tensor, ε |


3.2.1 The Kubo Formula 


Consider a system with the total Hamiltonian consisting of two parts,

$$H_{total} = H_{eq} + H_{ex}(t), \qquad H_{ex}(t) = -B \, u(t) F(t), \tag{3.26}$$

where $H_{eq}$ is the time-independent part in equilibrium and $H_{ex}(t)$ is the time-dependent
part with the external field F(t). The Heaviside step function u(t) equals 0 for t < 0 and 1
for t > 0 [169]. It is introduced to ensure that the external influence is turned on at t = 0.
Here B is a centered operator corresponding to an observable, such that $\langle B \rangle_{eq} = 0$. This
is not a real restriction because any operator can be converted to such a form by subtracting
its ensemble average from it. Even though we have included only one external stimulus in
the Hamiltonian, the following analysis can be readily extended to multiple stimuli because
a linear system’s responses to different perturbing stimuli add up independently in view of
the superposition principle. Some examples of external perturbations are given in Table 3.2.

We calculate the ensemble averages of the quantities involved using the density operator
ρ introduced in Section 2.2.3. Based on the discussion in Section 2.2.5, it satisfies the
following equation in the interaction picture:

$$\frac{d\rho(t)}{dt} + \frac{i}{\hbar} [H_{eq}, \rho(t)] = \frac{i}{\hbar} u(t) F(t) [B, \rho(t)]. \tag{3.27}$$

The formal solution of this equation for t > 0 is given by the integral equation

$$\rho(t) = \rho_{eq}(t) + \frac{i}{\hbar} \int_{0}^{t} U(t - \tau) F(\tau) [B, \rho(\tau)] U^\dagger(t - \tau) \, d\tau, \tag{3.28}$$

where $U(t) = \exp(-iH_{eq}t/\hbar)$ and $\rho_{eq}(t)$ is the solution of the homogeneous equation

$$\frac{\partial \rho_{eq}}{\partial t} + \frac{i}{\hbar} [H_{eq}, \rho_{eq}] = 0. \tag{3.29}$$

The integral equation can be used to obtain an approximate solution for ρ(t) through an
iterative procedure that begins by substituting $\rho(\tau) = \rho_{eq}(\tau)$ on the right side and then
uses the new solution to obtain the next-order solution.

Suppose there is another operator A for the system of interest, and we want to know 
how its observed values change as a result of the perturbation applied to the system. As 
the system was in equilibrium before the perturbation, its average value for t < 0 is given 
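by $\langle A\rangle_{\text{eq}} = \mathrm{Tr}[A\rho_{\text{eq}}]$. For $t > 0$, its ensemble average is given by $\langle A\rangle = \mathrm{Tr}[A\rho]$.
Clearly, the perturbation-induced change, $\Delta A(t) = A(t) - \langle A\rangle_{\text{eq}}$, is a centered operator.
To first order in $F$ (that is, with $\rho(\tau)$ replaced by $\rho_{\text{eq}}$ on the right side of Eq. (3.28)),
its ensemble average is given by

$$\langle\Delta A(t)\rangle = \mathrm{Tr}\left[\frac{i}{\hbar}\int_0^t A\,U(t-\tau)\,F(\tau)\,[B, \rho_{\text{eq}}]\,U^\dagger(t-\tau)\,d\tau\right]. \tag{3.30}$$

This expression can be simplified by noting that the trace operation is invariant under cyclic
permutation (i.e., $\mathrm{Tr}[ABC] = \mathrm{Tr}[BCA]$). The result is given by

$$\langle\Delta A(t)\rangle = \mathrm{Tr}\left[\frac{1}{i\hbar}\int_0^t \rho_{\text{eq}}\,[B, A(t-\tau)]\,F(\tau)\,d\tau\right] = \frac{1}{i\hbar}\int_0^t \langle[B, A(t-\tau)]\rangle_{\text{eq}}\,F(\tau)\,d\tau, \tag{3.31}$$

where we adopted the notation used in Section 2.2.5 and defined $A(t)$ in the interaction
picture as $A(t) = U^\dagger(t)AU(t)$. We also moved the trace operation inside the integral and
used the notation $\langle[B, A(t-\tau)]\rangle_{\text{eq}} = \mathrm{Tr}\big[\rho_{\text{eq}}[B, A(t-\tau)]\big]$.
As Eq. (3.31) is in the form of a convolution, we write it as

$$\langle\Delta A(t)\rangle = \int_0^t \chi_{AB}(t-\tau)\,F(\tau)\,d\tau, \tag{3.32}$$

where $\chi_{AB}(t)$ is the retarded linear response function, defined through the Kubo formula:

$$\chi_{AB}(t) = \frac{1}{i\hbar}\,u(t)\,\langle[B, A(t)]\rangle_{\text{eq}}, \tag{3.33}$$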


where $u(t)$ ensures causality of the response function. The Fourier transform of this function
is known as the generalized susceptibility (or generalized admittance). As required
for any causal response function, this susceptibility satisfies the Kramers–Kronig relations
described in Aside 3.2.
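
As a numerical illustration of the Kubo formula in Eq. (3.33), the following minimal sketch evaluates $\chi_{AB}(t)$ for a two-level system. The model ($H_{\text{eq}} = \hbar\omega_0\sigma_z/2$ with $A = B = \sigma_x$), the function name `kubo_chi`, the parameter values, and the units ($\hbar = 1$, $\beta = 1/k_BT$) are assumptions made only for this example. For this model the commutator can be worked out by hand, giving $\chi(t) = (2/\hbar)\tanh(\beta\hbar\omega_0/2)\sin(\omega_0 t)$ for $t > 0$, which the code should reproduce.

```python
import numpy as np

def kubo_chi(H, A, B, beta, t, hbar=1.0):
    """chi_AB(t) = u(t) <[B, A(t)]>_eq / (i*hbar) for a canonical ensemble."""
    if t <= 0:
        return 0.0                                      # step function, taken as 0 for t <= 0
    E, V = np.linalg.eigh(H)                            # eigen-decomposition of H_eq
    p = np.exp(-beta * (E - E.min()))
    p /= p.sum()                                        # normalized Boltzmann weights
    rho = (V * p) @ V.conj().T                          # rho_eq = V diag(p) V^dagger
    U = (V * np.exp(-1j * E * t / hbar)) @ V.conj().T   # U(t) = exp(-i H_eq t / hbar)
    A_t = U.conj().T @ A @ U                            # A(t) = U^dagger(t) A U(t)
    comm = B @ A_t - A_t @ B                            # [B, A(t)]
    return (np.trace(rho @ comm) / (1j * hbar)).real    # real for Hermitian A and B

# Hypothetical two-level example (hbar = 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
omega0, beta, t = 1.3, 2.0, 0.7
H = 0.5 * omega0 * sz
numeric = kubo_chi(H, sx, sx, beta, t)
analytic = 2.0 * np.tanh(0.5 * beta * omega0) * np.sin(omega0 * t)
print(numeric, analytic)
```

The eigen-decomposition is reused for both the thermal weights and the propagator, which keeps the sketch short; for large systems one would not build $U(t)$ explicitly.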


Aside 3.2 Kramers—Kronig Relations 
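We write the Fourier transform of $\chi_{AB}(t)$ as

$$\mathcal{F}\{\chi_{AB}\}(\omega) = \int_{-\infty}^{\infty} u(t)\,\chi_{AB}(t)\exp(i\omega t)\,dt, \tag{3.34}$$

where $\mathcal{F}\{\ldots\}$ denotes the Fourier transform operation and $u(t)$ is the step function. Using
the convolution theorem, we obtain

$$\mathcal{F}\{\chi_{AB}\}(\omega) = \frac{1}{2\pi}\,\mathcal{F}\{u\}(\omega) * \mathcal{F}\{\chi_{AB}\}(\omega), \qquad \mathcal{F}\{u\}(\omega) = \pi\delta(\omega) - \frac{1}{i\omega}, \tag{3.35}$$

where we used a known result for the Fourier transform of the Heaviside step function $u(t)$.
After a few algebraic manipulations, we obtain

$$\mathcal{F}\{\chi_{AB}\}(\omega) = \frac{1}{i\pi}\,\mathrm{P.V.}\int_{-\infty}^{\infty} \frac{\mathcal{F}\{\chi_{AB}\}(\omega')}{\omega' - \omega}\,d\omega', \tag{3.36}$$

where P.V. denotes Cauchy’s principal value of the integral. Separating the real and
imaginary parts of $\mathcal{F}\{\chi_{AB}\}(\omega)$, we obtain the Kramers–Kronig relations [170]

$$\mathrm{Re}\big[\mathcal{F}\{\chi_{AB}\}(\omega)\big] = \frac{1}{\pi}\,\mathrm{P.V.}\int_{-\infty}^{\infty} \frac{\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}\}(\omega')\big]}{\omega' - \omega}\,d\omega', \tag{3.37}$$

$$\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}\}(\omega)\big] = -\frac{1}{\pi}\,\mathrm{P.V.}\int_{-\infty}^{\infty} \frac{\mathrm{Re}\big[\mathcal{F}\{\chi_{AB}\}(\omega')\big]}{\omega' - \omega}\,d\omega'. \tag{3.38}$$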


These two relations show us how the real and imaginary parts of a causal response function 
are related. They are of immense importance in modeling quantum devices. For example, 
if the absorption coefficient (related to the imaginary part of the permittivity) is known for 
a material over the entire frequency range, the refractive index (related to the real part of 
the permittivity) can be calculated from it (and vice versa). 


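In practice, it is useful to write the Kramers–Kronig relations using only positive frequencies.
The Fourier transform of a real function $f(t)$ satisfies the relation $\mathcal{F}\{f(t)\}(-\omega) = \mathcal{F}^*\{f(t)\}(\omega)$.
Using it, we obtain the modified Kramers–Kronig relations (for causal real
functions) containing positive frequencies:

$$\mathrm{Re}\big[\mathcal{F}\{\chi_{AB}\}(\omega)\big] = \frac{2}{\pi}\,\mathrm{P.V.}\int_0^{+\infty} \frac{\omega'\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}\}(\omega')\big]}{\omega'^2 - \omega^2}\,d\omega', \tag{3.39}$$

$$\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}\}(\omega)\big] = -\frac{2}{\pi}\,\mathrm{P.V.}\int_0^{+\infty} \frac{\omega\,\mathrm{Re}\big[\mathcal{F}\{\chi_{AB}\}(\omega')\big]}{\omega'^2 - \omega^2}\,d\omega'. \tag{3.40}$$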
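
The positive-frequency relation in Eq. (3.39) can be checked numerically. The sketch below is an illustration under stated assumptions: the damped-oscillator susceptibility `chi_lorentz` (analytic in the upper half-plane, matching the $\exp(i\omega t)$ convention of Eq. (3.34)), the function names, the grid spacing, and the cutoff `w_max` are all choices made for this example. The principal value is handled by subtracting the value of the integrand's numerator at the singular point and adding back the analytically known integral of $1/(\omega'^2 - \omega^2)$ over the truncated range.

```python
import numpy as np

def chi_lorentz(w, w0=1.0, gamma=0.2):
    """Model susceptibility (assumed for illustration): 1 / (w0^2 - w^2 - i*gamma*w)."""
    return 1.0 / (w0**2 - w**2 - 1j * gamma * w)

def kk_real_part(w, im_chi, w_max=200.0, h=5e-4):
    """Re chi(w) from Im chi via Eq. (3.39), for 0 < w < w_max with w not on the midpoint grid."""
    wp = (np.arange(int(round(w_max / h))) + 0.5) * h    # midpoint grid on (0, w_max)
    g = wp * im_chi(wp)                                  # numerator w' * Im chi(w')
    g0 = w * im_chi(w)                                   # numerator at the singular point
    regular = np.sum((g - g0) / (wp**2 - w**2)) * h      # singularity removed by subtraction
    # P.V. integral of 1/(w'^2 - w^2) over (0, w_max), known in closed form
    pv_const = g0 * np.log((w_max - w) / (w_max + w)) / (2.0 * w)
    return (2.0 / np.pi) * (regular + pv_const)

im_part = lambda x: chi_lorentz(x).imag
for w in (0.5, 1.5):
    print(w, kk_real_part(w, im_part), chi_lorentz(w).real)
```

The contribution from frequencies above `w_max` is neglected; for this model it decays as the inverse cube of the cutoff.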


Let us consider how the Kramers–Kronig relations can be applied to analyze passive
dielectric materials. For such a material, the permittivity $\epsilon(\omega)$ represents its response to
an electromagnetic field. However, $\epsilon(\omega)$ does not vanish at large frequencies but approaches a
finite value as $\omega \to \infty$. This can be understood by noting that the material cannot respond
fast enough to very large frequencies. However, one requirement in deriving the Kramers–Kronig
relations was that the Fourier transform of the linear response function $\chi(t)$ must
vanish at infinity. Therefore, we must use the difference $\epsilon(\omega) - \epsilon_0$ for formulating the
Kramers–Kronig relations.


The preceding example shows that causality alone is sufficient to establish the Kramers–Kronig
relations in a passive dielectric medium. This is not the case for an active dielectric
medium or a magnetic medium because of the presence of instabilities [171]. The theory
of complex analytic functions is used in such situations to derive the Kramers–Kronig relations.
More specifically, if singularities exist in the upper-half complex plane, the contour
is adjusted such that the integral is taken on a line above the singularities.


In practice, the integrals appearing in the Kramers–Kronig relations converge slowly. A
way to improve the convergence is to use the subtractive Kramers–Kronig relations as
described in Ref. [170]. The idea is to incorporate independent measurements of the
real part of the permittivity at one or more reference wave numbers to minimize errors
due to extrapolations of the data. This process can be used to derive multiply-subtractive
Kramers–Kronig relations if the convergence still remains an issue.

3.2.2 Properties of Linear Response Function


As seen in Eq. (3.32), the response of an observable $A$ for a linear system can be written
as the convolution integral

$$\langle\Delta A(t)\rangle = \int_0^t \chi_{AB}(t-\tau)\,F(\tau)\,d\tau = \chi_{AB}(t) * F(t). \tag{3.41}$$


We emphasize that both $A$ and $B$ are Hermitian operators, as discussed in Section 2.2.
It is easy to verify that the commutator of two Hermitian operators is anti-Hermitian
(i.e., $[A, B]^\dagger = -[A, B]$). As a result, the expectation value of the commutator in Eq. (3.33)
is a purely imaginary number, making $\chi_{AB}(t)$ a real-valued function. It is important to note
that our derivation holds even when the operators are not Hermitian. However, the retarded
linear response function can take complex values in that situation.

It is common to call $\chi_{AB}(t)$ the “retarded” linear response function. The reason is that it
describes the response of the observable $A$ at time $t$ to an impulse applied to the system at
an earlier time $t - \tau$ through $B$. This observation follows from the commutative property
of the convolution operator in Eq. (3.41). Using $F(t) = \delta(t)$ for an impulse, we find

$$\langle\Delta A(t)\rangle = \chi_{AB}(t) * \delta(t) = \delta(t) * \chi_{AB}(t) = \int_0^t \chi_{AB}(\tau)\,\delta(t-\tau)\,d\tau = \chi_{AB}(t). \tag{3.42}$$

This equation shows clearly the response of a linear system at time $t$ when an impulse is
applied at time $t - \tau$.
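
The convolution in Eq. (3.41) is straightforward to evaluate on a time grid. The following sketch is only an illustration: the response function (a damped oscillation $e^{-at}\sin(\Omega t)/\Omega$), the function name `linear_response`, and all numerical values are assumptions, not quantities taken from the text. It checks two limiting cases: a discrete impulse returns $\chi_{AB}(t)$ itself, as in Eq. (3.42), and a unit step drives the response toward the time integral of $\chi_{AB}$, which equals $1/(a^2 + \Omega^2)$ for this model.

```python
import numpy as np

def linear_response(chi, F, dt):
    """Discretized Eq. (3.41): <Delta A(t_n)> ~ dt * sum_{m <= n} chi[n - m] * F[m]."""
    return dt * np.convolve(chi, F)[: len(F)]

dt = 0.01
t = np.arange(0.0, 60.0, dt)
a, Omega = 0.25, 2.0                                 # assumed decay rate and frequency
chi = np.exp(-a * t) * np.sin(Omega * t) / Omega     # assumed causal response for t >= 0

step = np.ones_like(t)                               # F(t) = 1 for t >= 0
resp = linear_response(chi, step, dt)
static = 1.0 / (a**2 + Omega**2)                     # integral of chi over (0, infinity)
print(resp[-1], static)

impulse = np.zeros_like(t)
impulse[0] = 1.0 / dt                                # discrete stand-in for delta(t)
print(np.allclose(linear_response(chi, impulse, dt), chi))
```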

Another important quantity is the linear response function denoted by $K_{AB}(t)$. It
determines the system’s response caused by the perturbation $H_{\text{ex}}(t) = -BF(t)$ without
assuming that $F(t)$ is applied to the system at time $t = 0$. It is defined similarly to $\chi_{AB}(t)$ in
Eq. (3.33) but without the step function $u(t)$:

$$K_{AB}(t) = \frac{1}{i\hbar}\,\langle[B, A(t)]\rangle_{\text{eq}}. \tag{3.43}$$

This function applies when $t$ varies in the range $(-\infty, \infty)$. It is easy to show that the two
response functions are related to each other as

$$K_{AB}(t) = \begin{cases} \chi_{AB}(t) & \text{if } t > 0, \\ -\chi_{BA}(-t) & \text{if } t < 0. \end{cases} \tag{3.44}$$

Here we used the cyclic permutation property of the trace operator for $t < 0$.

Frequently, the quantity of interest is not the observable $A$ but its time derivative,
$\dot{A} = dA/dt$. An example is when one is interested in the current flowing through a device
but the charge is measured. We can apply the result in Eq. (3.41) by simply replacing $A$
with $\dot{A}$:

$$\langle\Delta\dot{A}(t)\rangle = \int_0^t \chi_{\dot{A}B}(t-\tau)\,F(\tau)\,d\tau = \chi_{\dot{A}B}(t) * F(t). \tag{3.45}$$


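As the trace operation is invariant under a cyclic permutation, it is possible to write $\chi_{\dot{A}B}(t)$
in two different but useful forms:

$$\chi_{\dot{A}B}(t) = \frac{1}{i\hbar}\,u(t)\,\langle[B, \dot{A}(t)]\rangle_{\text{eq}} = -\frac{1}{i\hbar}\,u(t)\,\langle[\dot{B}(-t), A]\rangle_{\text{eq}}, \tag{3.46}$$

where $\dot{B}(-t)$ denotes the time derivative of $B$ evaluated at time $-t$.
Another useful form for $\chi_{AB}(t)$ is

$$\chi_{AB}(t) = \frac{1}{i\hbar}\,u(t)\int_t^{\infty} \langle[\dot{B}(-\tau), A]\rangle_{\text{eq}}\,d\tau. \tag{3.47}$$

It requires that $\langle[B(-\tau), A]\rangle_{\text{eq}} = 0$ in the limit $\tau \to \infty$. This condition follows
from the so-called mixing property, $\langle A(t_1)B(t_2)\rangle_{\text{eq}} = \langle A(t_1)\rangle_{\text{eq}}\langle B(t_2)\rangle_{\text{eq}}$ in the limit
$|t_1 - t_2| \to \infty$, which essentially says that there is no correlation between the two operators
when the time difference is very large. Recall that the time derivative of $B$ is obtained
using $i\hbar\dot{B} = [B, H_{\text{eq}}]$.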

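To prove the preceding result, we first note

$$\frac{d}{d\tau}\langle[B(-\tau), A]\rangle_{\text{eq}} = -\langle[\dot{B}(-\tau), A]\rangle_{\text{eq}} = -\frac{1}{i\hbar}\,\langle[[B(-\tau), H_{\text{eq}}], A]\rangle_{\text{eq}}, \tag{3.48}$$

where we used the Kubo identity

$$\frac{d}{dt}\big[\exp(tH_{\text{eq}})\,B\exp(-tH_{\text{eq}})\big] = \exp(tH_{\text{eq}})\,[H_{\text{eq}}, B]\exp(-tH_{\text{eq}}). \tag{3.49}$$

We then use Eq. (3.48) in Eq. (3.47), together with $\langle[B, A(t)]\rangle_{\text{eq}} = \langle[B(-t), A]\rangle_{\text{eq}}$ and the
vanishing of the commutator average as $\tau \to \infty$, to show that

$$\chi_{AB}(t) = \frac{1}{i\hbar}\,u(t)\,\langle[B(-t), A]\rangle_{\text{eq}} = -\frac{1}{i\hbar}\,u(t)\int_t^{\infty} \frac{d}{d\tau}\langle[B(-\tau), A]\rangle_{\text{eq}}\,d\tau = \frac{1}{i\hbar}\,u(t)\int_t^{\infty} \langle[\dot{B}(-\tau), A]\rangle_{\text{eq}}\,d\tau. \tag{3.50}$$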


We can now calculate the Fourier transform of x AB? which is known as the generalized 
conductance: 


1 œo | CO o x 
osso) = Flao = 5 f of BAO dea 
=00 t 


1 lee) ~x T es œ pit _ | 
= af (BAM f e uar= | ——y4g(t)dt. 68.51) 


l@ 


As we shall see later, this expression is useful for applying the fluctuation-dissipation 
theorem to a physical system. 


3.2.3 Generalized Susceptibility 


In practice, the retarded linear response function can be constructed using the eigenstates
of the Hamiltonian $H_{\text{eq}}$ in Eq. (3.26). Let $|\psi_n\rangle$ with $n = 0, 1, 2, \ldots$ form a complete set of
the eigenstates of this Hamiltonian, with the corresponding eigenvalues $E_n$. Here, $|\psi_0\rangle$ is
the ground state with energy $E_0$, $|\psi_1\rangle$ is the first excited state with energy $E_1$, and so on.
The retarded linear response requires us to calculate an ensemble average over a chosen
equilibrium distribution. It is common to choose the canonical or grand-canonical ensemble
for this purpose. It is also possible to employ the microcanonical ensemble (see the
derivation in Ref. [172]). However, as this choice does not involve any energy exchange,
we can only establish a relation between the linear response function and the correlation
function in terms of the total energy range of the ensemble, rather than its temperature. For
this reason, we exclude the microcanonical ensemble in the following discussion.

Suppose the density operator $\rho_{\text{eq}}$ represents one of the chosen ensembles and the corresponding
partition function is given by $Z_{\text{eq}} = \mathrm{Tr}(\rho_{\text{eq}})$. We know that the density operator
is diagonal in the eigenstates of the equilibrium Hamiltonian, even though it would take
the form of a symmetric density matrix ($\rho_{nm} = \rho_{mn}$) in other representations because of a
detailed balance principle that is crucial for maintaining the equilibrium distribution within
the ensemble [173]. Let $(\rho_{\text{eq}})_n$ be the $n$th diagonal element of $\rho_{\text{eq}}$. Adopting the notation,
$\rho_{\text{eq}}|\psi_n\rangle = [(\rho_{\text{eq}})_n/Z_{\text{eq}}]\,|\psi_n\rangle$, for the action of the density operator on an eigenstate, we
write the ensemble average for any operator $O$ as

$$\langle O\rangle_{\text{eq}} = \frac{1}{Z_{\text{eq}}}\sum_n \langle\psi_n|O|\psi_n\rangle\,(\rho_{\text{eq}})_n. \tag{3.52}$$

eq n 
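We use this equation and the completeness relation $\sum_n |\psi_n\rangle\langle\psi_n| = 1$ in Eq. (3.33) to
obtain

$$\mathcal{F}\{\chi_{AB}\}(\omega) = \frac{1}{i\hbar}\int u(t)\,\langle[B, A(t)]\rangle_{\text{eq}}\,e^{-i\omega t}\,dt = \frac{1}{i\hbar Z_{\text{eq}}}\sum_m\sum_n (\rho_{\text{eq}})_m \int_0^{\infty} e^{-i\omega t}\big[\langle\psi_m|B|\psi_n\rangle\langle\psi_n|A(t)|\psi_m\rangle - \langle\psi_m|A(t)|\psi_n\rangle\langle\psi_n|B|\psi_m\rangle\big]\,dt. \tag{3.53}$$

We can evaluate each term in the square bracket of the integral by invoking the definition
$A(t) = \exp(iH_{\text{eq}}t/\hbar)\,A\exp(-iH_{\text{eq}}t/\hbar)$ in the interaction picture. The resulting values are

$$\langle\psi_m|B|\psi_n\rangle\langle\psi_n|A(t)|\psi_m\rangle = \langle\psi_m|B|\psi_n\rangle\langle\psi_n|A|\psi_m\rangle \exp\left[\frac{i(E_n - E_m)t}{\hbar}\right], \tag{3.54}$$

$$\langle\psi_m|A(t)|\psi_n\rangle\langle\psi_n|B|\psi_m\rangle = \langle\psi_m|A|\psi_n\rangle\langle\psi_n|B|\psi_m\rangle \exp\left[\frac{i(E_m - E_n)t}{\hbar}\right]. \tag{3.55}$$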


To simplify further, we make use of the Sokhotski–Plemelj formula,

$$\lim_{\epsilon\to 0^+} \frac{1}{\omega \pm i\epsilon} = \mathrm{P.V.}\,\frac{1}{\omega} \mp i\pi\delta(\omega). \tag{3.56}$$


Aside 3.3 shows the conditions under which this formula is valid and how it can be used in 
practice. 

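To ensure the convergence of Fourier integrals in Eq. (3.53), we introduce a small positive
parameter $\epsilon$ through the factor $e^{-\epsilon t}$ and take the limit $\epsilon \to 0^+$ after performing
the integration. The resulting equations are interpreted by invoking the Sokhotski–Plemelj
formula. With this approach, the generalized susceptibility is found to be

$$\mathcal{F}\{\chi_{AB}\}(\omega) = \lim_{\epsilon\to 0^+} \frac{1}{i\hbar}\sum_{m,n} \frac{(\rho_{\text{eq}})_m}{Z_{\text{eq}}}\left[\frac{\langle\psi_m|B|\psi_n\rangle\langle\psi_n|A|\psi_m\rangle}{i\omega + i(E_m - E_n)/\hbar + \epsilon} - \frac{\langle\psi_m|A|\psi_n\rangle\langle\psi_n|B|\psi_m\rangle}{i\omega - i(E_m - E_n)/\hbar + \epsilon}\right]. \tag{3.57}$$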


This equation is known as the Lehmann representation (or spectral representation) of the 
generalized susceptibility. It is one of the most important results in linear response the- 
ory because it shows explicitly how a perturbation on a quantum device couples to its 
equilibrium energy spectrum before the perturbation is applied to the system. 
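
A spectral sum of this kind is easy to evaluate once $H_{\text{eq}}$ has been diagonalized. The sketch below is a hedged illustration, not the text's own procedure: it assumes the transform kernel $e^{-i\omega t - \epsilon t}$, a canonical ensemble with normalized Boltzmann weights, units with $\hbar = 1$, and a hypothetical function name `lehmann_chi`. For the two-level test case ($H_{\text{eq}} = \omega_0\sigma_z/2$, $A = B = \sigma_x$) the result at finite $\epsilon$ can be compared with the closed form $2\tanh(\beta\omega_0/2)\,\omega_0/[\omega_0^2 + (\epsilon + i\omega)^2]$, which follows from integrating $\chi(t) = 2\tanh(\beta\omega_0/2)\sin(\omega_0 t)$ directly.

```python
import numpy as np

def lehmann_chi(H, A, B, beta, omega, eps, hbar=1.0):
    """Spectral sum for the transform of chi_AB with kernel exp(-i*omega*t - eps*t)."""
    E, V = np.linalg.eigh(H)
    p = np.exp(-beta * (E - E.min()))
    p /= p.sum()                                      # (rho_eq)_m / Z_eq
    A_e = V.conj().T @ A @ V                          # <psi_m|A|psi_n>
    B_e = V.conj().T @ B @ V                          # <psi_m|B|psi_n>
    dE = (E[:, None] - E[None, :]) / hbar             # (E_m - E_n) / hbar
    term1 = B_e * A_e.T / (1j * omega + 1j * dE + eps)   # B_mn A_nm term
    term2 = A_e * B_e.T / (1j * omega - 1j * dE + eps)   # A_mn B_nm term
    return np.sum(p[:, None] * (term1 - term2)) / (1j * hbar)

# Hypothetical two-level check (hbar = 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
omega0, beta, omega, eps = 1.3, 2.0, 0.9, 0.05
value = lehmann_chi(0.5 * omega0 * sz, sx, sx, beta, omega, eps)
closed = 2.0 * np.tanh(0.5 * beta * omega0) * omega0 / (omega0**2 + (eps + 1j * omega) ** 2)
print(value, closed)
```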
Aside 3.3 Sokhotski–Plemelj Formula and Its Uses
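To prove the Sokhotski–Plemelj formula, we choose a smooth function $f(\omega)$ that is nonsingular
in a neighborhood of $\omega = 0$ and consider the identity

$$\int_{-\infty}^{\infty} \frac{f(\omega)}{\omega + i\epsilon}\,d\omega = f(0)\int_{-\infty}^{\infty} \frac{d\omega}{\omega + i\epsilon} + \int_{-\infty}^{\infty} \frac{f(\omega) - f(0)}{\omega + i\epsilon}\,d\omega.$$

Next we take the limit $\epsilon \to 0^+$. The first integral on the right side can be done using a
contour in the upper-half complex plane that excludes the origin through a half-circle of
radius $\epsilon$ centered at $\omega = 0$. Applying Cauchy’s theorem, the result is

$$\lim_{\epsilon\to 0^+}\int_{-\infty}^{\infty} \frac{d\omega}{\omega + i\epsilon} = -\pi i.$$

The second integral can also be evaluated, with the result

$$\lim_{\epsilon\to 0^+}\int_{-\infty}^{\infty} \frac{f(\omega) - f(0)}{\omega + i\epsilon}\,d\omega = \lim_{\epsilon\to 0^+}\int_{|\omega|>\epsilon} \frac{f(\omega) - f(0)}{\omega + i\epsilon}\,d\omega = \mathrm{P.V.}\int_{-\infty}^{\infty} \frac{f(\omega)}{\omega}\,d\omega,$$

where we used $\lim_{\epsilon\to 0^+}\int_{|\omega|>\epsilon} \frac{f(0)}{\omega + i\epsilon}\,d\omega = 0$ because it is an odd integral in that limit. The
principal value (P.V.) notation indicates that the integral excludes the contribution at the
singular point. Putting it together, we have proved the following result:

$$\lim_{\epsilon\to 0^+}\int_{-\infty}^{\infty} \frac{f(\omega)}{\omega + i\epsilon}\,d\omega = \mathrm{P.V.}\int_{-\infty}^{\infty} \frac{f(\omega)}{\omega}\,d\omega - i\pi f(0).$$

If we use the identity $f(0) = \int_{-\infty}^{\infty} f(\omega)\,\delta(\omega)\,d\omega$ and note that the preceding equation holds
for any smooth function $f(\omega)$, we obtain the result in Eq. (3.56).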
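
The limit above can be observed numerically at a small but finite $\epsilon$. In the sketch that follows, the Gaussian test function, the value of $\epsilon$, and the grid parameters are arbitrary choices made for illustration. The principal value is computed from the odd part of $f$, since $\mathrm{P.V.}\int f(\omega)/\omega\,d\omega = \int_0^\infty [f(\omega) - f(-\omega)]/\omega\,d\omega$, and the two sides are expected to agree up to corrections of order $\epsilon$.

```python
import numpy as np

def f(w):
    return np.exp(-(w - 1.0) ** 2)      # smooth test function (an arbitrary choice)

eps, h, W = 1e-3, 5e-5, 8.0             # assumed regularization, grid step, and cutoff
n = int(round(W / h))
w_pos = (np.arange(n) + 0.5) * h                   # midpoints on (0, W)
w = np.concatenate((-w_pos[::-1], w_pos))          # symmetric grid that avoids w = 0

lhs = np.sum(f(w) / (w + 1j * eps)) * h            # integral of f(w) / (w + i*eps)
pv = np.sum((f(w_pos) - f(-w_pos)) / w_pos) * h    # principal value via the odd part of f
rhs = pv - 1j * np.pi * f(0.0)
print(lhs, rhs)
```

The grid step is chosen much smaller than $\epsilon$ so that the narrow Lorentzian in the imaginary part is resolved.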


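As an alternative proof, consider the logarithm function $\ln(z)$ of a complex number $z$. It is
common to define its principal branch using the $\mathrm{Ln}(z)$ notation as

$$\mathrm{Ln}(z) = \ln(|z|) + i\,\mathrm{Arg}(z), \qquad \mathrm{Arg}(z) \in (-\pi, +\pi].$$

Using $z = \omega + i\epsilon$ where $\omega$ is real and taking the limit $\epsilon \to 0^+$, we obtain

$$\lim_{\epsilon\to 0^+} \mathrm{Ln}(\omega + i\epsilon) = \ln(|\omega|) + i\pi\,u(-\omega),$$

where $u(\omega)$ is the Heaviside step function, so that the argument tends to 0 for $\omega > 0$ and to $\pi$ for
$\omega < 0$. Differentiating this equation with respect to $\omega$
and taking the limit as $\epsilon \to 0^+$, we obtain the Sokhotski–Plemelj formula in Eq. (3.56) if
we use the known relations:

$$\frac{d}{d\omega}\,u(\omega) = \delta(\omega), \qquad \frac{d}{d\omega}\ln(|\omega|) = \mathrm{P.V.}\left(\frac{1}{\omega}\right).$$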


Let us consider an application of the Sokhotski–Plemelj formula. When modeling quantum
devices, one often has to evaluate a double integral of the form

$$I = \int_{-\infty}^{\infty} f(\omega)\,d\omega \int_0^{\infty} e^{i\omega t}\,dt.$$

This integral does not converge in the usual sense because the integrand $e^{i\omega t}$ does not vanish
as $t \to \infty$. We can make the integrand vanish at infinity by multiplying it with $e^{-\epsilon t}$ and
taking the limit $\epsilon \to 0^+$. With this modification, the integral becomes

$$I = \lim_{\epsilon\to 0^+}\int_{-\infty}^{\infty} f(\omega)\,d\omega \int_0^{\infty} e^{i\omega t - \epsilon t}\,dt = \lim_{\epsilon\to 0^+}\int_{-\infty}^{\infty} \frac{f(\omega)}{\epsilon - i\omega}\,d\omega = \pi f(0) + i\left(\mathrm{P.V.}\int_{-\infty}^{\infty} \frac{f(\omega)}{\omega}\,d\omega\right),$$

where we used the Sokhotski–Plemelj formula in Eq. (3.56). This result makes sense only
when $f(\omega)$ approaches zero for large $\omega$ to ensure the existence of the integral.


It is also useful to consider how the Fourier transform of the function f(t) = 1/t behaves 
because it does not decay to zero fast enough for large values of t. We just write the results 
because a proper derivation requires the application of Cauchy’s residue theorem with 
custom contours: 


PV. f PEN yy == in sgn(—a), 


—oo 


lim J OD ip A anra, 


e>0tJ-œ ttie 


These results will be useful later when we discuss the properties of certain quantum 
devices. 


3.3 Fluctuation-Dissipation Theorem 


The fluctuation-dissipation theorem provides a firm theoretical basis for the interaction 
of matter with its environment (surrounding fields) under the linear-response approxima- 
tion. The underlying theory relates spontaneous fluctuations of microscopic variables to 
the kinetic coefficients that are responsible for energy dissipation. 


3.3.1 Dynamic Correlation Function 
We start by introducing two correlation functions, $C_{AB}(t)$ and $S_{AB}(t)$, defined as

$$C_{AB}(t) = \langle A(t)B\rangle_{\text{eq}}, \qquad S_{AB}(t) = \frac{1}{2}\,\langle A(t)B + BA(t)\rangle_{\text{eq}}. \tag{3.58}$$

The function $C_{AB}(t)$ is called the dynamic correlation function, and it is a complex function
even for Hermitian operators. The function $S_{AB}(t)$ is a real function and is known as
the symmetric correlation function. By using the cyclic permutation property of the trace,
we can show that $\langle BA(t)\rangle_{\text{eq}} = C_{BA}(-t)$ and write it in the form

$$S_{AB}(t) = \frac{1}{2}\,[C_{AB}(t) + C_{BA}(-t)]. \tag{3.59}$$

This function has real values if both $A$ and $B$ are Hermitian so that $C_{BA}(-t) = C^*_{AB}(t)$.
The linear response function $K_{AB}(t)$ introduced in Eq. (3.43) is related to the dynamic
correlation function as

$$K_{AB}(t) = \frac{i}{\hbar}\,[C_{AB}(t) - C_{BA}(-t)]. \tag{3.60}$$

Taking the Fourier transform of this relation, we obtain

$$\mathcal{F}\{K_{AB}\}(\omega) = \frac{i}{\hbar}\,\big[\mathcal{F}\{C_{AB}(t)\}(\omega) - \mathcal{F}\{C_{BA}(t)\}(-\omega)\big], \tag{3.61}$$

where we used the relation $\mathcal{F}\{C_{BA}(-t)\}(\omega) = \mathcal{F}\{C_{BA}(t)\}(-\omega)$. Thus, we only need to
calculate the Fourier transform of the dynamic correlation function.

As the evaluation of this Fourier transform requires an ensemble average, we need to 
choose an equilibrium distribution. We choose the canonical ensemble as the relevant dis- 
tribution. We may also choose the grand-canonical ensemble, for which the same results 
are obtained. We do not consider the microcanonical ensemble because it does not allow 
us to relate the temperature of the system to the derived quantities (see Ref. [172]). The 
use of the canonical equilibrium distribution provides us with $\rho_{\text{eq}} = \exp(-H_{\text{eq}}/k_BT)/Z_{\text{eq}}$.
We use this form to obtain

$$C_{AB}(t) = \mathrm{Tr}[\rho_{\text{eq}}A(t)B] = \mathrm{Tr}[\rho_{\text{eq}}\exp(iH_{\text{eq}}t/\hbar)\,A\exp(-iH_{\text{eq}}t/\hbar)\,B] = \mathrm{Tr}[\exp(iH_{\text{eq}}t'/\hbar)\,A\exp(-iH_{\text{eq}}t'/\hbar)\,\rho_{\text{eq}}B] = \mathrm{Tr}[\rho_{\text{eq}}BA(t')], \tag{3.62}$$

where we have introduced $t' = t + i\hbar/k_BT$ as a complex variable. Even though this concept
has been used in many fields, including general relativity and quantum mechanics [174],
no physical meaning can be assigned to $t'$.
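
The complex-time identity $\mathrm{Tr}[\rho_{\text{eq}}A(t)B] = \mathrm{Tr}[\rho_{\text{eq}}BA(t + i\hbar/k_BT)]$ can be verified directly for a small matrix model. The sketch below is a minimal check under assumptions: random Hermitian matrices stand in for $H_{\text{eq}}$, $A$, and $B$, the helper names `heisenberg` and `random_hermitian` are hypothetical, and units with $\hbar = 1$ and $\beta = 1/k_BT$ are used.

```python
import numpy as np

def heisenberg(A, E, V, z, hbar=1.0):
    """A(z) = exp(i H z / hbar) A exp(-i H z / hbar) for complex z, with H = V diag(E) V^dagger."""
    left = (V * np.exp(1j * E * z / hbar)) @ V.conj().T
    right = (V * np.exp(-1j * E * z / hbar)) @ V.conj().T
    return left @ A @ right

rng = np.random.default_rng(0)

def random_hermitian(n):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return 0.5 * (M + M.conj().T)

n, beta, t, hbar = 4, 0.8, 0.37, 1.0                  # assumed test values
H, A, B = (random_hermitian(n) for _ in range(3))
E, V = np.linalg.eigh(H)
p = np.exp(-beta * E)
rho = (V * (p / p.sum())) @ V.conj().T                # canonical density operator

C_AB = np.trace(rho @ heisenberg(A, E, V, t) @ B)                        # Tr[rho A(t) B]
shifted = np.trace(rho @ B @ heisenberg(A, E, V, t + 1j * hbar * beta))  # Tr[rho B A(t')]
print(C_AB, shifted)
```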

We use the preceding expression to calculate the Fourier transform of $C_{AB}(t)$. The
result can be simplified as follows by using invariance of the trace operation under cyclic
permutations:

$$\mathcal{F}\{C_{AB}(t)\}(\omega) = \mathcal{F}\{\langle BA(t + i\hbar/k_BT)\rangle_{\text{eq}}\}(\omega) = \mathcal{F}\{\langle B(-t - i\hbar/k_BT)A\rangle_{\text{eq}}\}(\omega) = \exp(-\hbar\omega/k_BT)\,\mathcal{F}\{C_{BA}(t)\}(-\omega). \tag{3.63}$$

The last result is known as the detailed-balance condition for the correlation function
$C_{AB}(t)$. If we substitute this result in the symmetric correlation function $S_{AB}(t)$ defined in
Eq. (3.58), we obtain

$$\mathcal{F}\{S_{AB}(t)\}(\omega) = \mathcal{F}\{S_{BA}(t)\}(-\omega) \;\Rightarrow\; S_{AB}(t) = S_{BA}(-t). \tag{3.64}$$


The detailed-balance condition is a generic property of all systems in thermodynamic
equilibrium.

We use Eq. (3.63) to obtain the transform of the linear response function. Noting that
$\mathcal{F}\{C_{BA}(-t)\}(\omega) = \mathcal{F}\{C_{BA}(t)\}(-\omega)$, we first obtain

$$\mathcal{F}\{S_{AB}(t)\}(\omega) = \frac{1}{2}\,[1 + \exp(\hbar\omega/k_BT)]\,\mathcal{F}\{C_{AB}(t)\}(\omega). \tag{3.65}$$

Then, using Eq. (3.61), we obtain the following two important relations:

$$\mathcal{F}\{K_{AB}\}(\omega) = \frac{i}{\hbar}\,[1 - \exp(\hbar\omega/k_BT)]\,\mathcal{F}\{C_{AB}(t)\}(\omega), \tag{3.66}$$

$$\mathcal{F}\{K_{AB}\}(\omega) = \frac{2i}{\hbar}\,\frac{[1 - \exp(\hbar\omega/k_BT)]}{[1 + \exp(\hbar\omega/k_BT)]}\,\mathcal{F}\{S_{AB}(t)\}(\omega). \tag{3.67}$$


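We obtain the Fourier transform of $K_{AB}(t)$ using the generalized susceptibility of the
retarded linear response function [see Aside 3.2 and Eq. (3.44)]:

$$\mathcal{F}\{K_{AB}\}(\omega) = \int_{-\infty}^{\infty} K_{AB}(t)\,e^{-i\omega t}\,dt = \int_0^{\infty} \chi_{AB}(t)\,e^{-i\omega t}\,dt - \int_{-\infty}^{0} \chi_{BA}(-t)\,e^{-i\omega t}\,dt = \mathcal{F}\{\chi_{AB}(t)\}(\omega) - \mathcal{F}\{\chi_{BA}(t)\}(-\omega). \tag{3.68}$$

This expression can be further simplified when both $A$ and $B$ are Hermitian operators.
Given that the commutator of two Hermitian operators is anti-Hermitian, the expectation
value of the commutator is a purely imaginary number, and $\chi_{AB}(t)$ is a real-valued
function. Thus, $\mathcal{F}\{\chi_{AB}(t)\}(-\omega) = [\mathcal{F}\{\chi_{AB}(t)\}(\omega)]^*$. Using this relation in Eq. (3.68), we
obtain

$$\mathcal{F}\{K_{AB}\}(\omega) = 2i\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big]. \tag{3.69}$$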
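Equating the right-hand sides of Eqs. (3.66) and (3.69) gives us the well-known
fluctuation-dissipation theorem for both of the correlation functions [175]:

$$\mathcal{F}\{C_{AB}(t)\}(\omega) = \frac{2\hbar}{[1 - \exp(\hbar\omega/k_BT)]}\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big], \tag{3.70}$$

$$\mathcal{F}\{S_{AB}(t)\}(\omega) = \frac{\hbar\,[1 + \exp(\hbar\omega/k_BT)]}{[1 - \exp(\hbar\omega/k_BT)]}\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big]. \tag{3.71}$$

It is interesting to consider the classical limit in which $\hbar\omega \ll k_BT$. By approximating
$\exp(\hbar\omega/k_BT)$ with $1 + (\hbar\omega/k_BT)$, the preceding two relations take the forms

$$i\omega\,\mathcal{F}\{C_{AB}(t)\}(\omega) = -k_BT \times 2i\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big], \tag{3.72}$$

$$i\omega\,\mathcal{F}\{S_{AB}(t)\}(\omega) = -k_BT \times 2i\,\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big]. \tag{3.73}$$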


This result is expected because all operators commute with each other in the classical limit
and lead to $C_{AB}(t) = S_{AB}(t)$. Using Eq. (3.69) and taking the inverse Fourier transform,
we obtain the following time-domain result valid for $t > 0$:

$$\frac{\partial}{\partial t}\,C_{AB}(t) = -k_BT\,K_{AB}(t). \tag{3.74}$$

This result represents the classical fluctuation-dissipation theorem.

The quantum fluctuation-dissipation theorem in Eq. (3.70) shows explicitly the close
association between quantum fluctuations of a system, as described by the correlation functions,
and the linear response of that system as governed by the generalized susceptibility
$\mathcal{F}\{\chi_{AB}(t)\}(\omega)$. Aside 3.4 explains why these two quantities are identified as the “fluctuation”
and “dissipation” terms by considering a simple system. We stress that, even though
only the imaginary part of the generalized susceptibility appears explicitly in Eq. (3.70),
the real and imaginary parts are related to each other through the Kramers–Kronig relations.
The most general situation occurs when $A$ and $B$ are not observable (Hermitian) operators.
The fluctuation-dissipation theorem in this case is obtained with the substitution

$$\mathrm{Im}\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega)\big] \;\to\; \frac{1}{2i}\,\big[\mathcal{F}\{\chi_{AB}(t)\}(\omega) - \mathcal{F}\{\chi_{BA}(t)\}(-\omega)\big]. \tag{3.75}$$


Aside 3.4 Understanding the Terminology behind the Fluctuation-Dissipation 
Theorem 

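To understand the terminology associated with the fluctuation-dissipation theorem, it is
instructive to discuss the case of a periodic external perturbation using [see Eq. (3.26)]

$$H_{\text{ex}}(t) = -BF(t) = -\big[B^\dagger F_\omega \exp(-i\omega t - \epsilon t) + BF_\omega^* \exp(i\omega t - \epsilon t)\big].$$

The infinitesimally small, positive parameter $\epsilon$ spreads the linear system’s resonance over
a finite range of frequencies, thus ensuring a periodic response with a finite amplitude. Its
function is similar to that of a damping constant in a driven harmonic oscillator. Recalling
that the Hamiltonian of a system represents its energy, the average energy at any
instant is given by $\mathrm{Tr}\{\rho(t)[H_{\text{eq}} + H_{\text{ex}}(t)]\}$. The energy dissipation rate of the system can
be calculated using the Hellmann–Feynman theorem (see Aside 3.5),

$$P_E(t) = \frac{d}{dt}\,\mathrm{Tr}\big[\rho(t)(H_{\text{eq}} + H_{\text{ex}})\big] = \mathrm{Tr}\left[\frac{d\rho(t)}{dt}\,(H_{\text{eq}} + H_{\text{ex}})\right] + \mathrm{Tr}\left[\rho(t)\,\frac{dH_{\text{ex}}}{dt}\right]. \tag{3.76}$$

It follows from the Liouville equation that the first term vanishes. Using the preceding form
of $H_{\text{ex}}(t)$, the second term can be written as

$$\mathrm{Tr}\left[\rho(t)\,\frac{dH_{\text{ex}}}{dt}\right] = (i\omega + \epsilon)\,\mathrm{Tr}[\rho(t)B^\dagger]\,F_\omega \exp(-i\omega t - \epsilon t) - (i\omega - \epsilon)\,\mathrm{Tr}[\rho(t)B]\,F_\omega^* \exp(i\omega t - \epsilon t).$$

The average dissipated power $\bar{P}_E$ is calculated by averaging Eq. (3.76) over one cycle
of duration $T$ in the limit $\epsilon \to 0$:

$$\bar{P}_E = \lim_{\epsilon\to 0}\frac{1}{T}\int_0^T (i\omega + \epsilon)\,\langle B^\dagger(t)\rangle_{\text{eq}}\,F_\omega \exp(-i\omega t - \epsilon t)\,dt - \lim_{\epsilon\to 0}\frac{1}{T}\int_0^T (i\omega - \epsilon)\,\langle B(t)\rangle_{\text{eq}}\,F_\omega^* \exp(i\omega t - \epsilon t)\,dt.$$

The linear response theory can be used to show that

$$\langle B(t)\rangle_{\text{eq}} = \mathcal{F}\{\chi_{BB^\dagger}\}(\omega)\,F_\omega \exp(-i\omega t - \epsilon t),$$
$$\langle B^\dagger(t)\rangle_{\text{eq}} = \mathcal{F}\{\chi_{B^\dagger B}\}(-\omega)\,F_\omega^* \exp(i\omega t - \epsilon t).$$

Using these results and noting that $\mathcal{F}\{\chi_{BB^\dagger}\}^*(\omega) = \mathcal{F}\{\chi_{B^\dagger B}\}(-\omega)$, we obtain

$$\bar{P}_E = -2\omega\,\mathrm{Im}\big[\mathcal{F}\{\chi_{BB^\dagger}\}(\omega)\big]\,|F_\omega|^2. \tag{3.77}$$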


This equation shows that, on average, the power dissipated by a system is equal to the 
power received by it. 


The fluctuation-dissipation theorem is a powerful tool, both in classical and quantum 
mechanics, for predicting the behavior of systems that obey the detailed-balance condition. 
The theorem relies on the assumption that the response of a system in thermodynamic 
equilibrium to a small stimulus is the same as its response to a spontaneous fluctuation. 
A powerful consequence of this result is that it connects the relaxation of a linear system 
from a prepared nonequilibrium state to its quantum fluctuations occurring when the same 
system is in equilibrium. 


Aside 3.5 Hellmann-Feynman Theorem 


This theorem states that if $|\psi\rangle$ is an energy eigenstate of a system with the eigenvalue $E$,
then for any continuous parameter $\lambda$ on which the Hamiltonian of the system depends, we
have the relation

$$\frac{\partial E}{\partial\lambda}\,\langle\psi|\psi\rangle = \left\langle\psi\left|\frac{\partial H}{\partial\lambda}\right|\psi\right\rangle. \tag{3.78}$$

When the eigenstate is normalized ($\langle\psi|\psi\rangle = 1$), this theorem can be rephrased as follows: the
derivative of the energy with respect to a continuous parameter $\lambda$ equals the expectation
value of the derivative of the Hamiltonian with respect to that same parameter. Even
though the theorem is attributed to Hellmann [176] and Feynman [177], it was proven
independently by others including Güttinger [178] and Pauli [179].


The derivation starts from the eigenvalue equation $H|\psi\rangle = E|\psi\rangle$, which provides us with
the identity $\langle\psi|H|\psi\rangle = E\langle\psi|\psi\rangle$. Differentiating this identity with respect to $\lambda$, we obtain

$$\frac{\partial E}{\partial\lambda}\,\langle\psi|\psi\rangle + E\,\langle\psi'|\psi\rangle + E\,\langle\psi|\psi'\rangle = \langle\psi'|H|\psi\rangle + \left\langle\psi\left|\frac{\partial H}{\partial\lambda}\right|\psi\right\rangle + \langle\psi|H|\psi'\rangle. \tag{3.79}$$

We can simplify this equation by noting that $\langle\psi'|H|\psi\rangle = E\langle\psi'|\psi\rangle$, where $\psi' = \partial\psi/\partial\lambda$.
Also, the Hermitian nature of the Hamiltonian implies $\langle\psi|H|\psi'\rangle = E\langle\psi|\psi'\rangle$. Using these
two relations, we obtain the relation given in Eq. (3.78). It turns out that the theorem holds
even for approximate energy eigenstates, making it quite useful for numerical analysis.
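
The theorem is easy to test on a small matrix Hamiltonian. The sketch below assumes a hypothetical three-level model $H(\lambda) = H_0 + \lambda V$ with arbitrarily chosen matrices, and compares a central finite-difference estimate of $\partial E/\partial\lambda$ for the ground state with the expectation value $\langle\psi|\partial H/\partial\lambda|\psi\rangle = \langle\psi|V|\psi\rangle$.

```python
import numpy as np

# Hypothetical model: H(lam) = H0 + lam * V, so dH/dlam = V
H0 = np.diag([0.0, 1.0, 3.0])
V = 0.5 * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)

def ground_state(lam):
    """Lowest eigenvalue and its normalized eigenvector for H(lam)."""
    E, U = np.linalg.eigh(H0 + lam * V)
    return E[0], U[:, 0]

lam, d = 0.3, 1e-5
dE_numeric = (ground_state(lam + d)[0] - ground_state(lam - d)[0]) / (2 * d)  # finite difference
E0, psi = ground_state(lam)
dE_hf = psi @ V @ psi            # <psi| dH/dlam |psi> for a real, normalized eigenvector
print(dE_numeric, dE_hf)
```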
3.3.2 Johnson–Nyquist Noise


In 1928, Johnson discovered [180] and Nyquist explained [181] a source of noise in electrical
conductors, now called the Johnson–Nyquist noise. It results from thermal motion
of electrons inside an electrical conductor in equilibrium at some finite temperature and
occurs regardless of any applied voltage. In 1951, Callen and Welton proved the Johnson–Nyquist
result using a formulation based on the fluctuation-dissipation theorem [175].
Their analysis exploits the analogy of a physical system to an electrical circuit and is widely 
used in physics and engineering to describe the operation of lasers and transistors. It is also 
of utmost importance to characterize noise in quantum devices. 

The fluctuation-dissipation theorem, when written in equivalent electrical terms, relates 
thermal fluctuations in the current or voltage response of a system to its linear response 
quantified by the admittance or impedance (through the Thevenin or Norton theorem for 
equivalent circuits). To understand this, consider a passive two-terminal electrical circuit 
containing linear components (resistors, capacitors, air-core inductors, etc.) and assume 
that the circuit is in thermal equilibrium with a reservoir at temperature $T$. The two terminals
can be kept open, as shown in Figure 3.3. The dimensions of the circuit are assumed to
be small compared to the wavelength ($c/\omega$) so that the retardation effects can be neglected
for quantities observed at its terminals. We connect an external battery to this circuit so
that a current $I_E(t)$ flows through it. Using its Fourier transform $I_E(\omega)$, Ohm’s law gives
us the relation

$$I_E(\omega) = \mathcal{E}_E(\omega)/Z_E(\omega), \tag{3.80}$$

where $Z_E(\omega)$ is the electrical impedance of the circuit.

To relate the current to our linear response theory, we need to map the operator $B$
to $I_E$. This mapping requires some thought about the function $F(t)$ that appears in the
system Hamiltonian. Noting that the average energy of the system at any instant is given by
$\mathrm{Tr}\{\rho(t)[H_{\text{eq}} - BF(t)]\}$, the energy-dissipation rate of the system is given by

$$P_E(t) = \frac{d}{dt}\,\mathrm{Tr}\big[\rho(t)(H_{\text{eq}} - BF(t))\big] = \mathrm{Tr}\left[\frac{d\rho(t)}{dt}\,(H_{\text{eq}} - BF(t))\right] - \frac{dF(t)}{dt}\,\mathrm{Tr}[\rho(t)B]. \tag{3.81}$$

Figure 3.3: A passive electrical circuit with impedance $Z$, in equilibrium with a thermal bath at temperature $T$. An electromotive
force $\mathcal{E}_E$ is connected to the terminals and a current $I_E$ flows through the element with a voltage drop $V_E$ across it.

As before, the first term vanishes. Noting that $\mathrm{Tr}[\rho(t)B] = \langle B(t)\rangle_{\text{eq}}$, we can identify $B$
with $I_E$ and get the relation

$$P_E(t) = -\frac{dF(t)}{dt}\,\langle I_E\rangle_{\text{eq}}. \tag{3.82}$$

As the rate of energy dissipation for a circuit is given by $\langle\mathcal{E}_E(t)I_E\rangle_{\text{eq}}$, we must relate the
derivative $dF/dt$ to the applied voltage $\mathcal{E}_E(t)$ through the relation

$$\frac{dF(t)}{dt} = -\mathcal{E}_E(t) \;\Rightarrow\; i\omega\,\mathcal{F}\{F\}(\omega) = \mathcal{F}\{\mathcal{E}_E\}(\omega), \tag{3.83}$$

where we used the Fourier transform to replace the time derivative with $-i\omega$.
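We can now use the retarded linear transfer function, $\mathcal{F}\{\chi_{I_EI_E}\}(\omega)$, to obtain the relation

$$\mathcal{F}\{\chi_{I_EI_E}\}(\omega) = \frac{\mathcal{F}\{\langle I_E\rangle_{\text{eq}}\}(\omega)}{\mathcal{F}\{F\}(\omega)} = \frac{i\omega}{Z(\omega)}. \tag{3.84}$$

The fluctuation-dissipation theorem requires the imaginary part of this quantity:

$$\mathrm{Im}\big[\mathcal{F}\{\chi_{I_EI_E}\}(\omega)\big] = \mathrm{Im}\left[\frac{i\omega}{Z(\omega)}\right] = \frac{\omega\,\mathrm{Re}\{Z(\omega)\}}{|Z(\omega)|^2}. \tag{3.85}$$

Using this result in Eq. (3.70), we obtain the relations

$$\mathcal{F}\{C_{I_EI_E}(t)\}(\omega) = \frac{2\hbar\omega}{[1 - \exp(\hbar\omega/k_BT)]}\,\frac{\mathrm{Re}\{Z(\omega)\}}{|Z(\omega)|^2}, \tag{3.86}$$

$$\mathcal{F}\{S_{I_EI_E}(t)\}(\omega) = \frac{\hbar\omega\,[1 + \exp(\hbar\omega/k_BT)]}{[1 - \exp(\hbar\omega/k_BT)]}\,\frac{\mathrm{Re}\{Z(\omega)\}}{|Z(\omega)|^2}. \tag{3.87}$$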


In electrical literature, it is common to use the voltage $V_E$ measured across the circuit
shown in Figure 3.3. From Ohm’s law, we know that $\mathcal{F}\{V_E\}(\omega) = \mathcal{F}\{I_E\}(\omega)Z(\omega)$. The
corresponding voltage fluctuations are thus given by

$$\mathcal{F}\{C_{V_EV_E}(t)\}(\omega) = -\frac{2\hbar\omega\,\mathrm{Re}\{Z(\omega)\}}{[1 - \exp(\hbar\omega/k_BT)]}, \tag{3.88}$$

$$\mathcal{F}\{S_{V_EV_E}(t)\}(\omega) = -\hbar\omega\,\mathrm{Re}\{Z(\omega)\}\,\frac{[1 + \exp(\hbar\omega/k_BT)]}{[1 - \exp(\hbar\omega/k_BT)]}. \tag{3.89}$$

These results provide the spectral density of thermal noise at all frequencies. In the classical
limit, we can approximate $\exp(\hbar\omega/k_BT)$ with $1 + \hbar\omega/(k_BT)$. It is common to quote the noise
spectral density using only positive frequencies, which requires multiplying the result with
a factor of two. With these modifications, we obtain the well-known formula

$$\mathcal{F}\{C_{V_EV_E}(t)\}(\omega) = 4k_BT\,\mathrm{Re}\{Z(\omega)\}. \tag{3.90}$$


Figure 3.4 shows the equivalent circuit representation of this result using a noise source
with a bandwidth $\Delta f = \Delta\omega/(2\pi)$.
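
To give a sense of scale, the short sketch below evaluates the classical formula $4k_BT\,\mathrm{Re}\{Z\}$ for a purely resistive element and compares it with the positive-frequency quantum expression $2\hbar\omega\coth(\hbar\omega/2k_BT)\,\mathrm{Re}\{Z\}$, which is the magnitude of Eq. (3.89) multiplied by two. The function names, the interpretation of the spectral density as volts squared per hertz, and the example values (a 1 kΩ resistor at 300 K in a 10 kHz bandwidth) are assumptions made for this illustration.

```python
import numpy as np

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J*s

def psd_classical(R, T):
    """Classical one-sided voltage noise density 4 k_B T R (assumed units: V^2/Hz)."""
    return 4.0 * K_B * T * R

def psd_quantum(f, R, T):
    """One-sided symmetrized density 2*hbar*w*coth(hbar*w / (2 k_B T)) * R."""
    w = 2.0 * np.pi * f
    x = HBAR * w / (2.0 * K_B * T)
    return 2.0 * HBAR * w * R / np.tanh(x)

def v_rms(R, T, bandwidth):
    """Root-mean-square noise voltage in the given bandwidth (classical limit)."""
    return np.sqrt(psd_classical(R, T) * bandwidth)

R, T = 1.0e3, 300.0
print(v_rms(R, T, 1.0e4))                                # about 0.4 microvolt
print(psd_quantum(1.0e6, R, T) / psd_classical(R, T))    # essentially 1 at 1 MHz
print(psd_quantum(1.0e13, R, T) / psd_classical(R, T))   # quantum correction at 10 THz
```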
3.4 Dielectric Function


The dielectric function describes the reaction of a system to an external electromagnetic 
perturbation. As long as this reaction is linear, the response can be obtained by invoking 
the linear response theory of Section 3.3. As the linear response of a system is independent 
of the external perturbation, the dielectric function is a property of the system itself. In this 
section, we consider the electric susceptibility, which describes the polarization resulting 
from an incident electric field. 


3.4.1 Calculation Using Linear Response Theory 


The Gauss theorem provides a way to relate the charge density of a material to the scalar 
potential. As this potential and the electric field are unambiguously related to each other 
once the gauge is fixed, the dielectric function is most conveniently calculated by consid- 
ering the charges and potentials involved. We employ the Coulomb gauge and account for 
the influence of the external field through the interaction Hamiltonian 


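$$H_{\text{ex}}(t) = \int \rho_e(\mathbf{r})\,\phi_{\text{ext}}(\mathbf{r}, t)\,d\mathbf{r}, \tag{3.91}$$

where the charge density $\rho_e(\mathbf{r})$ acts as the operator $B$ in Eq. (3.26) and the scalar potential
$\phi_{\text{ext}}(\mathbf{r}, t)$ plays the role of $F(t)$. In the quasi-static approximation, $\phi_{\text{ext}}$ is related to the external
electric field through $\mathbf{E}_{\text{ext}} = -\nabla\phi_{\text{ext}}$. The charge density $\rho_e$ created by the perturbation
generates the potential $\phi$ through the Poisson equation (see Section 2.3):

$$\nabla^2\phi(\mathbf{r}, t) = -\frac{1}{\epsilon_0}\,\rho_e(\mathbf{r}, t). \tag{3.92}$$

This equation can be solved using the Green’s function to obtain

$$\phi(\mathbf{r}, t) = \frac{1}{4\pi\epsilon_0}\int \frac{\rho_e(\mathbf{r}', t)}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}'. \tag{3.93}$$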


Figure 3.4: Equivalent circuit for thermal noise generated by a passive electrical circuit that is in thermal equilibrium at
temperature $T$. A noise source on the left generates the current $I_E$ and the voltage drop $V_E$ across an element with
the impedance $Z$.

We recall that $\langle\rho_e(\mathbf{r}, 0)\rangle_{\text{eq}} = 0$ as the medium is assumed to have no free charges before
the perturbation is turned on. After the perturbation is applied, we can find $\langle\rho_e(\mathbf{r}, t)\rangle_{\text{eq}}$ by
invoking the linear response theory of Section 3.3 (with $\rho_e$ acting as both $A$ and $B$) and
write the ensemble-averaged charge density as

$$\langle\rho_e(\mathbf{r}, t)\rangle_{\text{eq}} = -\iint \chi_{\rho_e\rho_e}(\mathbf{r}, t; \mathbf{r}_1, t_1)\,\phi_{\text{ext}}(\mathbf{r}_1, t_1)\,dt_1\,d\mathbf{r}_1, \tag{3.94}$$


where Xp,o,(¥, t; r1, ¢1) is the polarizability. As a charge density at one point creates a poten- 
tial at another point, as indicated in Eq. (3.93), the total potential ¢(r, t) can be written as 
the sum of the incident induced potentials: 


600.) = dente |SS Me EEG, ryt). dt dri de. (3.95) 


4reo|r — r2| 


We can rearrange this expression to show the input-output relationship explicitly as 


se)= ff EG rps — t) pee an] 
4reo|r — r2| 


x Pext(¥1, t1) dtı dr}. (3.96) 


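The dielectric function $\epsilon(\mathbf{r}, t; \mathbf{r}_1, t_1)$ is defined to relate the field variables E and D as

$$\mathbf{E}(\mathbf{r}, t) = \iint \epsilon^{-1}(\mathbf{r}, t; \mathbf{r}_1, t_1)\,\mathbf{D}(\mathbf{r}_1, t_1)\,dt_1\,d\mathbf{r}_1. \tag{3.97}$$

We can convert it to a relation between the two potentials by using

$$\mathbf{E}(\mathbf{r}, t) = -\nabla\phi(\mathbf{r}, t), \qquad \mathbf{D}(\mathbf{r}, t) = -\nabla\phi_{\text{ext}}(\mathbf{r}, t). \tag{3.98}$$

It is easy to show that the two potentials are also related as

$$\phi(\mathbf{r}, t) = \iint \epsilon^{-1}(\mathbf{r}, t; \mathbf{r}_1, t_1)\,\phi_{\text{ext}}(\mathbf{r}_1, t_1)\,dt_1\,d\mathbf{r}_1. \tag{3.99}$$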


A direct comparison of this equation with Eq. (3.96) provides us with the important result: 


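$$\epsilon^{-1}(\mathbf{r}, t; \mathbf{r}_1, t_1) = \delta(\mathbf{r} - \mathbf{r}_1)\,\delta(t - t_1) - \int \frac{\chi_{\rho_e\rho_e}(\mathbf{r}_2, t; \mathbf{r}_1, t_1)}{4\pi\epsilon_0|\mathbf{r} - \mathbf{r}_2|}\,d\mathbf{r}_2. \tag{3.100}$$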

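If we assume that the medium has translational invariance in both space and time, its
polarizability depends only on the differences $\mathbf{r}_2 - \mathbf{r}_1$ and $t - t_1$. The dielectric function
for such a medium takes the form:

$$\epsilon^{-1}(\mathbf{r} - \mathbf{r}_1, t - t_1) = \delta(\mathbf{r} - \mathbf{r}_1)\,\delta(t - t_1) - \int \frac{\chi_{\rho_e\rho_e}(\mathbf{r}_2 - \mathbf{r}_1, t - t_1)}{4\pi\epsilon_0|\mathbf{r} - \mathbf{r}_2|}\,d\mathbf{r}_2. \tag{3.101}$$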


As the last term is in the form of a convolution in both space and time, we can write $\epsilon^{-1}$ in
the Fourier domain in the following compact form:

$$\mathcal{F}\{\epsilon^{-1}\}(\mathbf{k}, \omega) = 1 - \frac{\mathcal{F}\{\chi_{\rho_e\rho_e}\}(\mathbf{k}, \omega)}{\epsilon_0k^2}, \tag{3.102}$$

where the four-dimensional Fourier transform of a function $G(\mathbf{r}, t)$ is defined as

$$\mathcal{F}\{G\}(\mathbf{k}, \omega) = \iint G(\mathbf{r}, t)\exp(i\mathbf{k}\cdot\mathbf{r} - i\omega t)\,d\mathbf{r}\,dt. \tag{3.103}$$


3.4.2 Relation to the Frequency-Dependent Conductivity


In practice, it is useful to write the dielectric function in terms of the material’s conductivity 
that can be measured using a variety of techniques. Ohm’s law gives us the relation 


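$$\mathcal{F}\{\mathbf{J}\}(\mathbf{k}, \omega) = \mathcal{F}\{\sigma\}(\mathbf{k}, \omega)\,\mathcal{F}\{\mathbf{E}\}(\mathbf{k}, \omega) = -i\mathbf{k}\,\mathcal{F}\{\sigma\}(\mathbf{k}, \omega)\,\mathcal{F}\{\phi\}(\mathbf{k}, \omega). \tag{3.104}$$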


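We can eliminate $\mathcal{F}\{\mathbf{J}\}(\mathbf{k}, \omega)$ using the continuity equation, $\partial\rho_e/\partial t + \nabla\cdot\mathbf{J} = 0$, which, in
the Fourier domain, takes the form

$$\mathbf{k}\cdot\mathcal{F}\{\mathbf{J}\}(\mathbf{k}, \omega) = \omega\,\mathcal{F}\{\rho_e\}(\mathbf{k}, \omega) = -\omega\,\mathcal{F}\{\chi_{\rho_e\rho_e}\}(\mathbf{k}, \omega)\,\mathcal{F}\{\phi_{\text{ext}}\}(\mathbf{k}, \omega), \tag{3.105}$$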


where we used Eq. (3.94) in the Fourier domain. Combining the two preceding equations, 
we obtain 


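$$\mathcal{F}\{\chi_{\rho_e\rho_e}\}(\mathbf{k}, \omega) = \frac{ik^2\,\mathcal{F}\{\sigma\}(\mathbf{k}, \omega)/\omega}{1 + i\,\mathcal{F}\{\sigma\}(\mathbf{k}, \omega)/(\epsilon_0\omega)}. \tag{3.106}$$

Using this form in Eq. (3.102), the dielectric function can be written in terms of the
conductivity as

$$\mathcal{F}\{\epsilon\}(\mathbf{k}, \omega) = 1 + i\,\frac{\mathcal{F}\{\sigma\}(\mathbf{k}, \omega)}{\epsilon_0\omega}. \tag{3.107}$$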
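
The step from Eq. (3.106) to Eq. (3.107) is short enough to spell out. The following sketch assumes the form $\mathcal{F}\{\epsilon^{-1}\} = 1 - \mathcal{F}\{\chi_{\rho_e\rho_e}\}/(\epsilon_0k^2)$ for Eq. (3.102) and uses the shorthand $s = i\mathcal{F}\{\sigma\}/(\epsilon_0\omega)$, a symbol introduced here only for this step:

```latex
\begin{align*}
\frac{\mathcal{F}\{\chi_{\rho_e\rho_e}\}}{\epsilon_0 k^2} &= \frac{s}{1 + s}, \\
\mathcal{F}\{\epsilon^{-1}\} &= 1 - \frac{s}{1 + s} = \frac{1}{1 + s}, \\
\mathcal{F}\{\epsilon\} &= 1 + s = 1 + i\,\frac{\mathcal{F}\{\sigma\}}{\epsilon_0\omega}.
\end{align*}
```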


It is interesting to note that the frequency-dependent conductivity $\sigma(\mathbf{k}, \omega)$, which obeys the
causality requirement, is proportional to $\mathcal{F}\{\epsilon^{-1}\} - 1$. This is the reason that the Kramers-Kronig
relations must be applied to $\mathcal{F}\{\epsilon^{-1}\} - 1$, and not to the dielectric function itself.
The form of the dielectric function in Eq. (3.107) can be simplified further for materials
whose linear response is spatially local (i.e., the response at any point depends only on the
perturbation near that point; for details, see Refs. [182, 183]). In the Fourier domain, this
amounts to setting k = 0. It is common to drop the k dependence and write Eq. (3.107) as

$$\mathcal{F}\{\epsilon\}(\omega) = 1 + i\,\frac{\mathcal{F}\{\sigma\}(\omega)}{\epsilon_0\omega}. \tag{3.108}$$


This form is valid when the wavelength inside the material is significantly longer than other 

characteristic lengths such as the unit-cell size or the mean free path of electrons [184]. 
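
As a quick numerical illustration of the local relation between conductivity and dielectric function, the sketch below feeds a Drude-type conductivity into $1 + i\sigma/(\epsilon_0\omega)$ and compares the result with the Drude permittivity divided by $\epsilon_0$. The conductivity form $\epsilon_0\omega_p^2/(\gamma - i\omega)$, the function names, and the numerical values are assumptions chosen for illustration, not taken from the text.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def drude_sigma(omega, omega_p, gamma):
    """Assumed Drude conductivity eps0*wp^2/(gamma - i*omega), exp(-i*omega*t) convention."""
    return EPS0 * omega_p**2 / (gamma - 1j * omega)


def eps_from_sigma(sigma, omega):
    """Relative dielectric function 1 + i*sigma/(eps0*omega) for a local medium."""
    return 1 + 1j * sigma / (EPS0 * omega)


def drude_eps_rel(omega, omega_p, gamma):
    """Drude permittivity divided by eps0: 1 - wp^2/(omega^2 + i*gamma*omega)."""
    return 1 - omega_p**2 / (omega**2 + 1j * gamma * omega)


if __name__ == "__main__":
    # Illustrative angular frequencies in rad/s (not measured data).
    w, wp, g = 3.0e15, 1.37e16, 1.0e14
    print(eps_from_sigma(drude_sigma(w, wp, g), w))
    print(drude_eps_rel(w, wp, g))
```

Both printed values should agree to rounding error, which is a convenient sanity check on sign conventions before mixing conductivity and permittivity data from different sources.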
Equation (3.107) implicitly separates the charges into bound and free charges (via the
definitions of E and D). At low frequencies, $\mathcal{F}\{\epsilon\}(\omega)$ is used for describing the response
of bound charges to a driving field, leading to an electric polarization, while $\mathcal{F}\{\sigma\}(\omega)$
describes the contribution of free charges to the current flow. Even though such a clear
separation is possible in certain spectral regions, the distinction between the bound and free
charges is blurred at high frequencies, particularly at optical wavelengths. For example, in
the case of highly doped semiconductors, the response of bound valence electrons can be
lumped into a static dielectric constant $\epsilon_{sd}$, while the response of conduction electrons is
lumped into the conductivity $\mathcal{F}\{\sigma\}(\omega)$, leading to a dielectric function of the form [28]

$$\mathcal{F}\{\epsilon\}(\omega) = \epsilon_{sd} + i\,\frac{\mathcal{F}\{\sigma\}(\omega)}{\epsilon_0\omega}. \tag{3.109}$$

It is possible to rearrange the two terms without affecting the function value by replacing
$\epsilon_{sd}$ with 1 and subtracting $i\epsilon_0\omega(\epsilon_{sd} - 1)$ from $\mathcal{F}\{\sigma\}(\omega)$, as suggested in Ref. [185].
We have chosen not to make such a change and write the dielectric function in its widely
used form.

The preceding discussion shows that, in general, $\mathcal{F}\{\epsilon\}(\omega)$ and $\mathcal{F}\{\sigma\}(\omega)$ are complex
functions that are often determined through experimental measurements. At optical
frequencies, it is common to employ the concept of a complex refractive index using the
definition

$$n(\omega) = \sqrt{\mathcal{F}\{\epsilon\}(\omega)} = n_R(\omega) + i\,n_I(\omega), \tag{3.110}$$

where the real part $n_R$ is called the refractive index and the imaginary part $n_I$ is related to
the loss of electromagnetic power through absorption inside the medium.
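
A small sketch of this definition in code may help when handling tabulated permittivity data. It assumes a relative (dimensionless) permittivity as input and picks the square-root branch with a non-negative imaginary part, which is the physically relevant one for an absorbing medium; the function name is hypothetical.

```python
import cmath


def complex_index(eps_rel):
    """Return n = sqrt(eps_rel) on the branch with a non-negative imaginary part."""
    n = cmath.sqrt(complex(eps_rel))
    # cmath.sqrt returns the principal root; flip it only if it implies gain.
    return -n if n.imag < 0 else n


if __name__ == "__main__":
    n = complex_index(-10 + 1j)  # illustrative metal-like value, not measured data
    print(n.real, n.imag)        # small refractive index, large extinction
```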


3.4.3 Models Used for the Permittivity 


Many models have been developed to find an analytic form of $\mathcal{F}\{\epsilon\}(\omega)$ for different media.
For example, a free-electron-gas model is often used for metals to calculate the response of
electrons to a time-varying electric field. This model, known as the Drude model, provides
the following expression [185]:

$$\mathcal{F}\{\epsilon_{\text{Drude}}\}(\omega) = \epsilon_0 - \frac{\epsilon_0\omega_p^2}{\omega^2 + i\gamma(\omega)\omega}, \tag{3.111}$$

where $\omega_p = \sqrt{q^2N_0/(m_e\epsilon_0)}$ is the plasma frequency introduced in Section 1.3.1 ($N_0$ being
the density of electrons) and $\gamma$ is the collision rate of the free-electron gas [28]. This
formula is valid if the smallest dimension of a metallic object is larger than the electron's
mean free path $l_e$ (defined in Section 1.2). If that is not the case, $\gamma$ needs to be replaced by
$\gamma_n$ defined as [186]:


$$\gamma_n(\omega) = \gamma(\omega) + 2g_s(v_F/D), \tag{3.112}$$

where D is the enclosing diameter of the nanoscale object and 


Ef 


s= hbarw 


1 
f L+ K) dx. (3.113) 
l-k 


The Fermi velocity $v_F$ and the Fermi energy $E_F$ have been defined in Section 1.2. The
confinement of electrons to a nanoscale object provides the second contribution to $\gamma_n$.

To the lowest order, the linear response of an electron gas is governed by the plasma
frequency that depends on the density $N_0$ of the electrons (see Section 1.3.1). Notice that
this frequency does not depend on the size of the object that contains the electron gas [33].
At low frequencies such that $\omega \ll \gamma$, absorption is so large inside metals that any external
electric field decays from the metal surface as $e^{-z/\delta}$, where the skin depth $\delta$ is defined


as [28]

$$\delta = \frac{c}{\omega_p}\sqrt{\frac{2\gamma}{\omega}}. \tag{3.114}$$


Owing to the frequency dependence of $\gamma(\omega)$, special care has to be taken to find the
conductivity in the limit $\omega \to 0$ (see Ref. [187]).

The Drude model involves two time scales. The time scale for charge separation within
the electron gas is governed by $1/\omega_p$; it restores the quasi-neutral behavior of the electron
gas when subjected to disturbances. The second time scale is the relaxation time, defined
as $\tau = 1/\gamma$ (see Refs. [188, 189, 190]). It describes the relaxation of excited electrons,
and its magnitude depends on the frequency of the electromagnetic wave as $\gamma(\omega) = \gamma_0 + b\omega^2$,
where $\gamma_0$ and b are two constants [191, 192]. The electron-phonon interaction is
responsible for $\gamma_0$, while Umklapp scattering provides the constant b.

The treatment of a metal as a free-electron gas entirely ignores the fact that the motion 
of electrons is also affected by the metal’s nuclei within the crystal lattice and other elec- 
trons bound to these nuclei. Their influence can be taken into account by introducing a 
background dielectric constant $\epsilon_b$ that is real, a reasonable approach in most cases of
practical interest. This quantity can be calculated by considering the interband transitions from
the fully occupied bands below the Fermi energy to the half-filled conduction band. The
Drude dielectric function given in Eq. (3.111) is then modified as

$$\mathcal{F}\{\epsilon\}(\omega) = \epsilon_b(\omega) - \frac{\epsilon_0\omega_p^2}{\omega^2 + i\gamma(\omega)\omega}. \tag{3.115}$$

Even though the Drude model is a useful tool that can describe many significant features
of a metal's permittivity, especially if we include the interband contributions, it fails
on several counts. First, because the model relies on the conduction-band electrons, any 
process that may excite electrons in the bands of lower energies (usually a d-band) can 
introduce significant deviations from its predictions. This is why the Drude model pro- 
vides better predictions in the infrared region compared to the visible region. Second, the 
assumption that the conduction band is parabolic (through the effective-mass approxima- 
tion) is a severe limitation for representing real metals. Third, the Drude model ignores 
the energy dependence of y, even though significant energy dependence of this parameter 
is expected from a Boltzmann-type analysis. Fourth, a severe limitation results from the 
local-response assumption that completely ignores the wave-vector dependence of the per- 
mittivity. Such an assumption is valid only when the electric field varies so slowly that it is 
safe to discard spatial dispersion. 

In reality, the permittivity has two length scales corresponding to two different phe- 
nomena. The first effect is the screening of the field at an interface of metals that can be 
described by classical electrostatic theory. The Drude model assumes a negligible screen- 
ing length. However, at the atomic scale the screening process requires a certain distance 
to take place. The screening length is also known as the Thomas—Fermi length for metals 
[185]. The second length scale is associated with the nonlocal response of the metal. It
corresponds to the distance traveled by an electron of velocity $v_F$ over the duration of an
optical cycle. When this length is much smaller than the wavelength of the external field,
the optical properties are not affected much. This can occur when $v_F/\omega \ll \lambda$ and $\omega/\gamma \gg 1$.
In contrast, nonlocal corrections are required when $v_F/\omega_p > \lambda/2\pi$. Also, when the
electron velocity $v_F$ equals the phase velocity ($v_p = \omega\lambda/2\pi$) of an electromagnetic field, the
so-called Landau damping becomes important [193].

The Drude model is not suitable for modeling dielectrics or semiconductors. Reasonably
accurate permittivity models that can describe dielectrics or semiconductors include
the Sellmeier and Cauchy models for transparent materials with no conduction electrons
[194], models for semiconductors based on the harmonic-oscillator approximation, the
Lorentz model for dielectrics [195], and the Tauc-Lorentz model [196]. The Lorentz
model assumes that every electron is bound to a positively charged atomic nucleus by a
harmonic potential and that it exhibits damped harmonic oscillations under the influence
of an external electric field. This model can be used to describe dielectrics that exhibit
momentum-preserving intraband transitions. It is similar in form to the Drude model except
for the addition of a resonance frequency $\omega_0$ describing the oscillations of the bound
electrons:

$$\mathcal{F}\{\epsilon_{\text{Lorentz}}\}(\omega) = \epsilon_b(\omega) + \frac{\epsilon_0\omega_p^2}{\omega_0^2 - \omega^2 - i\omega\gamma}. \tag{3.116}$$

The Lorentz model is widely used for dielectric materials.
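
The close relationship between the two models can be checked numerically. The sketch below works with permittivities divided by $\epsilon_0$, takes the Lorentz term as $\omega_p^2/(\omega_0^2 - \omega^2 - i\gamma\omega)$, and treats the background value as a plain constant; the function names and parameter values are illustrative assumptions.

```python
def drude_eps_rel(omega, omega_p, gamma, eps_b=1.0):
    """Drude permittivity over eps0: eps_b - wp^2/(omega^2 + i*gamma*omega)."""
    return eps_b - omega_p**2 / (omega**2 + 1j * gamma * omega)


def lorentz_eps_rel(omega, omega_p, omega_0, gamma, eps_b=1.0):
    """Lorentz permittivity over eps0: eps_b + wp^2/(w0^2 - omega^2 - i*gamma*omega)."""
    return eps_b + omega_p**2 / (omega_0**2 - omega**2 - 1j * gamma * omega)


if __name__ == "__main__":
    # With no restoring force (omega_0 = 0) the Lorentz form collapses to Drude.
    print(lorentz_eps_rel(2.0, 5.0, 0.0, 0.3))
    print(drude_eps_rel(2.0, 5.0, 0.3))
    # Static limit of a bound-electron resonance: eps_b + (wp/w0)^2.
    print(lorentz_eps_rel(0.0, 2.0, 4.0, 0.1))
```

Setting $\omega_0 = 0$ removes the restoring force, so the two printed values in the first pair coincide; a nonzero $\omega_0$ instead gives a finite, positive static permittivity, as expected for a dielectric.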

In recent years, new types of lasers, known as spasers because of the role played by sur- 
face plasmons, have been developed. They are based on 2D semimetals such as graphene 
and 2D semiconductors such as MoS₂ [197, 198, 199]. It is worth considering how the
electrical behavior of these materials differs from the Drude or Lorentz-type models (see Refs.
[200, 201]). Here we focus on graphene, although a similar analysis can be adopted
for other 2D materials. The main difference would be that some 2D materials have finite 
bandgaps (e.g., transition-metal dichalcogenides, hexagonal boron nitride, and silicene). 

Experimental and theoretical results show that the permittivity of graphene is given by

$$\mathcal{F}\{\epsilon_{\text{graphene}}\}(\omega) = \epsilon_0 + \frac{i}{\omega d}\,\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega), \tag{3.117}$$

where d is the thickness of graphene and $\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega)$ is its conductivity. Early
calculations of graphene's conductivity were carried out using the Dirac Hamiltonian
[202, 203, 204]. Most studies considered the effects of disorder in a phenomenological
manner, but they were included self-consistently by Peres et al. [205]. Further
improvements have incorporated electron-electron interactions [206, 207], a finite chemical
potential, spatial dispersion [208, 209], a graphene bilayer [210, 211, 212, 213], the
use of the Boltzmann distribution [214, 215], and the effects of temperature [216]. Based
on these advances, the conductivity of graphene is found to be given by [216, 217, 218]


$$\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega) = \frac{e^2}{8\hbar}\left[\tanh\left(\frac{\hbar\omega + 2E_F}{4k_BT}\right) + \tanh\left(\frac{\hbar\omega - 2E_F}{4k_BT}\right)\right] - \frac{ie^2}{8\pi\hbar}\ln\left[\frac{(\hbar\omega + 2E_F)^2}{(\hbar\omega - 2E_F)^2 + (2k_BT)^2}\right] + \frac{ie^2}{\pi\hbar^2}\,\frac{2k_BT}{\omega + i\gamma}\ln\left[2\cosh\left(\frac{E_F}{2k_BT}\right)\right], \tag{3.118}$$

where $\gamma$ is the intraband scattering rate and $E_F$ is the Fermi energy relative to the Dirac
point obtained from $N_e = (E_F/\hbar v_F)^2/\pi$. The first two terms in Eq. (3.118) result from
interband transitions, and the last term is due to intraband transitions. Owing to a strong
dependence of the conductivity of graphene on the Fermi energy $E_F$, the permittivity of
graphene can be easily controlled through electrostatic gating or chemical doping, both of
which have been exploited in practice.

Consider the high-doping limit of the preceding equation. It is easy to see that graphene
becomes lossy when $\hbar\omega > 2E_F$ (owing to the dominance of interband transitions), or when
$\hbar\omega < \hbar\gamma$ so that intraband free-carrier absorption dominates. If $E_F \gg k_BT$, Eq. (3.118)
reduces to the standard Drude model because

$$\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega) = \frac{i\sigma_0}{\omega + i\gamma} \quad\text{with}\quad \sigma_0 = \frac{e^2E_F}{\pi\hbar^2}. \tag{3.119}$$

The parameter $\sigma_0$ is often used as a fitting parameter [219, 220]. For higher plasmon
energies, we have to supplement this expression with an additional term that takes into account
the interband contributions (for $E_F \gg k_BT$), resulting in

$$\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega) \approx \frac{i\sigma_0}{\omega + i\gamma} + \frac{e^2}{4\hbar}\left[u(\hbar\omega - 2E_F) - \frac{i}{\pi}\ln\left|\frac{\hbar\omega + 2E_F}{\hbar\omega - 2E_F}\right|\right], \tag{3.120}$$

where u(x) is the Heaviside step function [169]. The logarithmic singularity in this
expression at $\hbar\omega = 2E_F$ disappears at finite temperatures, as seen in Eq. (3.118).


3.5 Sum Rules 


The sum rules of the retarded linear response function (see Section 3.3) are the relations in 
the form of an integral that any generalized susceptibility must satisfy. They stipulate that 
the Fourier transform $\mathcal{F}\{\chi_{AB}\}(\omega)$ of this function must be related to an equilibrium
correlation function of certain derivatives of the operators A(t) and B(t) at the same moment.
In practice, the sum rules are important to ensure the validity of any phenomenological 
model adopted for calculating the generalized susceptibility, especially at short wave- 
lengths. When applied to the dielectric function of a medium (which is derived from a 
generalized susceptibility), the sum rules can be viewed as universal constraints on the 
frequency-domain results of the dielectric function of a medium. The existence of the inte- 
grals associated with the sum rules can be established using the superconvergence theorem 
given in Aside 3.6. 

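Our starting point is the fluctuation-dissipation theorem given in Eq. (3.69). Converting
it to the time domain and using Eq. (3.43), this equation can be written as

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} \mathrm{Im}\left[\mathcal{F}\{\chi_{AB}\}(\omega)\right]\exp(-i\omega t)\,d\omega = \langle[A(t), B]\rangle_{\text{eq}}. \tag{3.121}$$

If we substitute t = 0 in this equation, we obtain the first sum rule

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} \mathrm{Im}\left[\mathcal{F}\{\chi_{AB}\}(\omega)\right]d\omega = \langle[A, B]\rangle_{\text{eq}}, \tag{3.122}$$

where we used the relation $A(t) = U^\dagger(t)AU(t)$ with $U(t) = \exp(-iH_{\text{eq}}t/\hbar)$.
We can find other sum rules by taking the derivatives on both sides of Eq. (3.121) and
using the relation

$$\frac{d^n}{dt^n}A(t) = \left(\frac{i}{\hbar}\right)^n W^nA(t), \tag{3.123}$$

where the operator W is defined as $WA(t) = [H_{\text{eq}}, A(t)]$. Taking the nth derivative, we
obtain

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} (i\omega)^n\,\mathrm{Im}\left[\mathcal{F}\{\chi_{AB}\}(\omega)\right]\exp(-i\omega t)\,d\omega = \left(\frac{i}{\hbar}\right)^n\langle[W^nA(t), B]\rangle_{\text{eq}}. \tag{3.124}$$

When this expression is evaluated at t = 0, we get multiple sum rules for different values
of the integer n in the form

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} (\hbar\omega)^n\,\mathrm{Im}\left[\mathcal{F}\{\chi_{AB}\}(\omega)\right]d\omega = \langle[W^nA, B]\rangle_{\text{eq}}. \tag{3.125}$$

It is instructive to look at the simplest case for the generalized susceptibility associated
with the operator B and given by $\mathcal{F}\{\chi_{BB}\}(\omega)$. Using A = B in the preceding equation, we
obtain

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} (\hbar\omega)^n\,\mathrm{Im}\left[\mathcal{F}\{\chi_{BB}\}(\omega)\right]d\omega = \langle[W^nB, B]\rangle_{\text{eq}}. \tag{3.126}$$

Noting that $\mathrm{Im}[\mathcal{F}\{\chi_{BB}\}(\omega)]$ is an odd function of frequency $\omega$, the preceding integral
vanishes for even values of n. Restricting to the odd powers by substituting n = 2p + 1, we
get the relation

$$\frac{\hbar}{\pi}\int_{-\infty}^{\infty} (\hbar\omega)^{2p+1}\,\mathrm{Im}\left[\mathcal{F}\{\chi_{BB}\}(\omega)\right]d\omega = \langle[W^{2p+1}B, B]\rangle_{\text{eq}}, \qquad p = 0, 1, \ldots. \tag{3.127}$$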


The sum rule for p = 0 is known as the f-sum rule and is widely used for the complex 
refractive index of a medium. It is especially useful for checking the self-consistency of 
experimental or model-generated data. Also, as the dielectric function depends on atomic 
transitions, the sum rules can yield information about such transitions. It is interesting 
to note that the f-sum rule is analogous to the Thomas—Reiche—Kuhn sum rule used in 
quantum mechanics [221]. 


Aside 3.6 Superconvergence Theorem 


The superconvergence theorem is central to the construction of sum rules. Here we just 
state this theorem without proof, as it is not hard to ascertain its validity by inspection. 
However, a rigorous proof requires an intricate analysis based on complex function theory. 
The theorem is formulated for a function f(x) that is continuously differentiable and
asymptotically behaves as $f(x) = O[(x\ln x)^{-1}]$. This function is used to define a new function as
$g(y) = \text{P.V.}\int_0^\infty [f(x)/(y^2 - x^2)]\,dx$. Then, the following asymptotic result holds for y >> x
[170]:

We can find the f-sum rule for the complex refractive index $n(\omega)$ using the Kramers-Kronig
relations given in Eq. (3.110):

$$n_R(\omega) - 1 = \frac{2}{\pi}\,\text{P.V.}\int_0^\infty \frac{\omega' n_I(\omega')}{\omega'^2 - \omega^2}\,d\omega', \qquad n_I(\omega) = -\frac{2\omega}{\pi}\,\text{P.V.}\int_0^\infty \frac{n_R(\omega') - 1}{\omega'^2 - \omega^2}\,d\omega'. \tag{3.128}$$

The f-sum rule depends on the asymptotic behavior of $n(\omega)$ obtained from the Lorentz
model in Eq. (3.116) using $\epsilon_b/\epsilon_0 = 1$ and $n = \sqrt{\epsilon/\epsilon_0}$. In the limit $\omega \to \infty$, $n(\omega)$ behaves as

$$n(\omega) = 1 - \frac{1}{2}\left(\frac{\omega_p}{\omega}\right)^2 + \cdots, \tag{3.129}$$

where $\omega_p$ is the plasma frequency defined in Eq. (3.131). If this result is applied to the first
Kramers-Kronig relation in Eq. (3.128), we get the f-sum rule:

$$\int_0^\infty \omega\,n_I(\omega)\,d\omega = \frac{\pi}{4}\,\omega_p^2. \tag{3.130}$$
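
The f-sum rule lends itself to a direct numerical check. The sketch below assumes a single-resonance Lorentz model with unit background (permittivity divided by $\epsilon_0$), integrates $\omega\,n_I(\omega)$ with a plain trapezoidal rule on a grid that is linear near the resonance and geometric in the tail, and compares the result with $\pi\omega_p^2/4$. The grid limits, parameter values, and function names are choices made for this illustration.

```python
import cmath
import math


def lorentz_index(omega, omega_p=1.0, omega_0=1.0, gamma=0.1):
    """Complex refractive index sqrt(eps/eps0) of a single Lorentz resonance."""
    eps_rel = 1 + omega_p**2 / (omega_0**2 - omega**2 - 1j * gamma * omega)
    return cmath.sqrt(eps_rel)


def f_sum_integral(omega_p=1.0, omega_0=1.0, gamma=0.1):
    """Trapezoidal estimate of the integral of omega*n_I(omega) from 0 to 1e6."""
    # Fine linear grid across the resonance, geometric grid for the slow tail.
    linear = [20.0 * i / 200000 for i in range(200001)]
    tail = [20.0 * (1.0e6 / 20.0) ** (i / 20000) for i in range(1, 20001)]
    grid = linear + tail
    vals = [w * lorentz_index(w, omega_p, omega_0, gamma).imag for w in grid]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))


if __name__ == "__main__":
    print(f_sum_integral(), math.pi / 4)  # the two numbers should nearly agree
```

The same routine, pointed at measured or fitted $n_I(\omega)$ data instead of the model, gives a simple self-consistency test of the kind described above, provided the data extend far enough into the high-frequency tail.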


3.6 Surface Plasmons 


Surface plasmons have attracted considerable attention in recent years in the context of 
spasers [184]. The properties of dielectric functions can be exploited to generate surface 
plasmons on the surface of a metal, both in passive and active structures. To describe their 
properties, we need to first understand how the electrons in a metal interact with an external 
electromagnetic field. A rigorous analysis normally impedes the insight gained from an 
approximate, back-of-the-envelope type analysis. It is worth carrying out such an analysis 
first to identify the relevant length, time, and energy scales. 

A major simplification is realized by adopting the Born—Oppenheimer approximation 
[222] because it separates the vibrational dynamics of molecules from their electronic 
response. This approximation is based on the intuition that the motion of heavy nuclei 
can be separated from that of loosely bound electrons (mostly valence electrons) because 
electrons, being much lighter, can follow the motion of the nuclei almost instantaneously. 
As electrons move through a solid, they experience Coulomb forces from other electrons 
and atomic cores scattered throughout the solid. The net result is that the valence electrons 
experience a time-dependent potential, whose inclusion adds significant complexity and 
hinders our ability to build a simplified picture of the electron’s response. The problem can 
be simplified by considering a single electron in an effective periodic but time-independent 
potential, produced by the stationary nuclei and other electrons in their equilibrium posi- 
tions. It is common to neglect electron—electron interactions that cannot be represented as 
a local potential for the single electron under consideration (such as those arising from the 
exchange of two electrons). In this viewpoint, free electrons behave as a gas and move in a 
fixed distribution of positive charges, ensuring the electrical neutrality of the system. This 
model is known as the jellium model [33].

When subjected to an external electromagnetic field, the dynamics of the electron gas
in the jellium model is characterized by collective oscillations of the electron gas. To the
lowest order, the linear response of the electron gas is governed by the frequency of these
oscillations. This frequency, known as the plasma frequency, has appeared in the Drude
model discussed in Section 3.4.3. It depends on the density $N_e$ of electrons as

$$\omega_p = \sqrt{q^2N_e/(m_e\epsilon_0)}, \tag{3.131}$$

but it does not depend on the size of the object that contains the electron gas [33]. The
plasma frequency marks the frequency beyond which a metal can no longer screen electric
fields. The oscillations arise from the appearance of a restoring Coulomb force when
electrons are displaced from their charge-neutral configuration (thus creating a net positive
charge in the region they leave). Owing to their inertia, the returning electrons do not stop at
the charge-neutral configuration but overshoot it, creating a charge imbalance of the opposite
sign. This effect gives rise to coherent oscillations at the plasma frequency. The coherence of
this collective motion is progressively destroyed by the Landau damping and by collisions of an
electron with phonons or other electrons [34, 35, 36]. The Drude model discussed in
Section 3.4.3 includes this damping through the parameter $\gamma$.
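
To get a feel for the magnitudes involved, the plasma frequency can be evaluated directly from the electron density. The sketch below uses SI constants and, as an illustrative input, a free-electron density of 5.9e28 per cubic meter, a value commonly quoted for gold; the function name and the choice of example are assumptions made here.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
M_E = 9.1093837015e-31       # electron mass in kg
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
HBAR = 1.054571817e-34       # reduced Planck constant in J s


def plasma_frequency(n_e):
    """Angular plasma frequency sqrt(q^2 * N_e / (m_e * eps0)) in rad/s."""
    return math.sqrt(E_CHARGE**2 * n_e / (M_E * EPS0))


if __name__ == "__main__":
    w_p = plasma_frequency(5.9e28)            # illustrative density in m^-3
    print(w_p, HBAR * w_p / E_CHARGE)         # rad/s and the equivalent energy in eV
```

Because the density enters under a square root, quadrupling the electron density only doubles the plasma frequency, and no length scale of the object appears anywhere in the expression.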

An intriguing feature, common to all metals that can be described by the Drude model,
is the monotonic decrease of the real part of the permittivity with decreasing frequency. For
example, $\mathrm{Re}[\epsilon(\omega)/\epsilon_0]$ for noble metals has a small positive value in the ultraviolet region,
but it takes large negative values in the infrared region. A negative value of this quantity is
essential to sustain the localized surface plasmons in nanostructures, and this is the main
reason why metals are considered an essential ingredient for plasmonics. Indeed, many plasmonic
applications require the range $-1 > \mathrm{Re}[\epsilon(\omega)/\epsilon_0] > -20$ in practice. To recognize the
importance of the imaginary part of the permittivity, $\mathrm{Im}[\epsilon(\omega)/\epsilon_0]$, we need to consider the
quality factor of localized plasmonic oscillations [223]. The imaginary part is responsible for
losses inside the material. The real and imaginary parts enter together in a quality factor Q
that provides a figure of merit of the plasmons and is defined as

$$Q = \frac{\omega}{2\,\mathrm{Im}[\epsilon(\omega)]}\,\frac{d}{d\omega}\mathrm{Re}[\epsilon(\omega)]. \tag{3.132}$$

As discussed in Ref. [223], near the resonance frequency of a localized plasmon, Q
depends solely on the complex dielectric function of the material and is independent of the
geometry of the nanostructure. This result assumes the quasi-static limit in which the
dimensions of a nanoparticle are much smaller than the wavelength of the electromagnetic field.
It also assumes that the dielectric part responsible for plasmonic resonance is lossless.
The dielectric part, although essential for surface plasmons, plays a less significant role
in determining the sharpness and quality of a plasmonic resonance. For typical metals,
$\mathrm{Im}[\epsilon(\omega)/\epsilon_0]$ is in the range of 2-10. Surface plasmons become well pronounced only when
$\mathrm{Im}[\epsilon(\omega)] \ll -\mathrm{Re}[\epsilon(\omega)]$, so that losses remain relatively small [224].
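
The material-only character of Q can be illustrated for a Drude metal. The sketch below assumes the form $Q = \omega\,(d\,\mathrm{Re}\,\epsilon/d\omega)/(2\,\mathrm{Im}\,\epsilon)$, with the factor of 2 taken from the commonly quoted version of this figure of merit, and evaluates the derivative by a central finite difference. For the Drude permittivity this expression works out to $\omega^3/[\gamma(\omega^2 + \gamma^2)]$, which approaches $\omega/\gamma$ when $\omega \gg \gamma$. All names and numbers are illustrative.

```python
def drude_eps_rel(omega, omega_p, gamma):
    """Drude permittivity over eps0: 1 - wp^2/(omega^2 + i*gamma*omega)."""
    return 1 - omega_p**2 / (omega**2 + 1j * gamma * omega)


def quality_factor(eps_func, omega, rel_step=1e-6):
    """Q = omega * d(Re eps)/d(omega) / (2 * Im eps), via a central difference."""
    h = omega * rel_step
    d_re = (eps_func(omega + h).real - eps_func(omega - h).real) / (2 * h)
    return omega * d_re / (2 * eps_func(omega).imag)


if __name__ == "__main__":
    wp, g, w = 1.37e16, 1.0e14, 3.0e15  # rad/s, illustrative values only
    q = quality_factor(lambda x: drude_eps_rel(x, wp, g), w)
    print(q, w / g)  # Q is close to omega/gamma when omega >> gamma
```

Because `quality_factor` accepts any callable returning a complex permittivity, the same function can be applied to an interpolant of tabulated data.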

Although our focus here is on the localized surface plasmons, it is instructive to also 
consider propagating surface plasmons, known as the surface-plasmon polaritons (SPPs) 
because of their coupling to an electromagnetic field. Their study provides useful length 
scales to estimate the behavior of localized surface plasmons. SPPs propagate along a
metal-dielectric interface and suffer from losses in the directions both parallel and
perpendicular to this interface. The conditions for the existence of SPPs can be written
as [29]

$$\epsilon_d\epsilon_m < 0 \quad\text{and}\quad \epsilon_d + \epsilon_m < 0, \tag{3.133}$$

where $\epsilon_d$ and $\epsilon_m$ are the permittivities of the dielectric and the metal, respectively. The
wavelength $\lambda_\parallel$ and the propagation distance $\delta_\parallel$ can be derived from the dispersion relation
of SPPs:

$$k_\parallel = k_0\sqrt{\frac{\epsilon_d\epsilon_m}{\epsilon_d + \epsilon_m}}, \tag{3.134}$$

where $k_0 = 2\pi/\lambda_0$ is the free-space wave vector. Taking the real and imaginary parts of
$k_\parallel$, we obtain [225]

$$\lambda_\parallel = 2\pi/\mathrm{Re}(k_\parallel), \qquad \delta_\parallel = [2\,\mathrm{Im}(k_\parallel)]^{-1}. \tag{3.135}$$

The penetration depth of a SPP on the two sides of the metal-dielectric interface is
calculated using $\delta_{d\perp} = [\mathrm{Im}(k_{d\perp})]^{-1}$ and $\delta_{m\perp} = [\mathrm{Im}(k_{m\perp})]^{-1}$. These two quantities are
related to the permittivities of the metallic and dielectric materials as

$$k_{m\perp} = \sqrt{\epsilon_m/\epsilon_d}\,k_\parallel, \qquad k_{d\perp} = \sqrt{\epsilon_d/\epsilon_m}\,k_\parallel. \tag{3.136}$$

For a metallic nanoparticle with a size much smaller than $\delta_{m\perp}$, the optical field
penetrates the entire system and drives oscillations of the metal electrons.
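
The length scales above are easy to tabulate for a given pair of permittivities. The sketch below assumes relative permittivities, the dispersion relation $k_\parallel = k_0\sqrt{\epsilon_d\epsilon_m/(\epsilon_d + \epsilon_m)}$, and the perpendicular wave vectors $\sqrt{\epsilon_m/\epsilon_d}\,k_\parallel$ and $\sqrt{\epsilon_d/\epsilon_m}\,k_\parallel$. It takes the magnitude of each imaginary part so that the result does not depend on the square-root branch. The metal value used is a round, silver-like number for illustration, not measured data.

```python
import cmath
import math


def spp_scales(eps_m, eps_d, lambda_0):
    """Return SPP wave vectors and length scales for a single flat interface."""
    k0 = 2 * math.pi / lambda_0
    k_par = k0 * cmath.sqrt(eps_d * eps_m / (eps_d + eps_m))
    k_m = cmath.sqrt(eps_m / eps_d) * k_par   # perpendicular wave vector in the metal
    k_d = cmath.sqrt(eps_d / eps_m) * k_par   # perpendicular wave vector in the dielectric
    return {
        "k_par": k_par,
        "k_m": k_m,
        "wavelength": 2 * math.pi / k_par.real,
        "propagation": 1 / (2 * abs(k_par.imag)),
        "depth_metal": 1 / abs(k_m.imag),
        "depth_dielectric": 1 / abs(k_d.imag),
    }


if __name__ == "__main__":
    # Illustrative inputs: eps_m = -18 + 0.5i, air, 633 nm free-space wavelength.
    for name, value in spp_scales(-18 + 0.5j, 1.0, 633e-9).items():
        print(name, value)
```

For metal-like inputs of this kind the field reaches much farther into the dielectric than into the metal, and the SPP wavelength is slightly shorter than the free-space wavelength.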

The partition of energy in such a system into the electric and magnetic parts can be 
understood by the following intuitive argument [226]. If the particle is purely dielectric, 
the stored magnetic energy, $U_H \sim \frac{1}{2}H^2/(c^2\epsilon_0)$, is much smaller than the electric energy,
$U_E \sim \frac{1}{2}\epsilon_0E^2$, where E and H are the fields inside the particle. To sustain oscillations,
each energy-storage mechanism must be able to fully store the energy in the other format.
Owing to an energy-storage imbalance, a dielectric nanoparticle fails to function as a
self-sustaining oscillator. However, if one introduces free carriers (thus making the particle
metallic), a part of the energy can also be stored as the kinetic energy of electrons given by
$U_K \sim \frac{1}{2}\epsilon_0(\omega_p/\omega)^2E^2$. This opens the possibility of realizing energy balance at a frequency
at which $U_H + U_K = U_E$ is satisfied. This frequency is precisely the plasma frequency of
the nanoparticle. In this situation, the energy is mostly contained in the kinetic oscillations
of electrons. It is the size of the nanoparticle and free-electron density that define the spatial
scale of the localization of optical energy and govern it through the quality factor Q given
in Eq. (3.132). As a result, optical fields can be confined to nanosize dimensions with their
spatial distribution scaling with the system’s size. This physical picture is at the heart of 
modern nanoplasmonics [184, 224]. 

The preceding analysis shows that the parameters that govern the formation of surface 
plasmons are fixed in the case of a metal—dielectric interface. The situation changes if we 
consider 2D materials such as graphene, for which a few key parameters that are respon- 
sible for sustaining plasmon oscillations can be tuned through chemical doping or by 
changing the electric and magnetic fields [227, 228]. Indeed, by varying such parameters, 
plasmons can be excited in 2D materials at frequencies ranging from microwave to the
optical region, significantly surpassing the relatively low range of frequencies covered by the
noble metals. Interestingly, it has also been observed that highly doped graphene exhibits
lower losses and longer plasmon lifetimes compared with the noble metals [229, 230]. 
To understand how this is possible, we need to look at the features that make graphene a 
suitable candidate for sustaining surface plasmons. 

Consider a graphene sheet (or another 2D material) surrounded by two dielectric media
with permittivities $\epsilon_{1d}$ and $\epsilon_{2d}$. We assume that the graphene sheet lies in the (x, y) plane,
with the z-axis perpendicular to this plane. For both the TE and TM surface waves, we
assume that the electric field has the form

$$\mathbf{E}(x, y, z, t) = \mathbf{E}^m\exp(ik_\parallel x - i\omega t - k_{j,\perp}z), \qquad m = \text{TE or TM}, \tag{3.137}$$

where j = 1, 2 for the two dielectric materials surrounding the graphene sheet. The x and y
axes are oriented such that the TE wave has its electric field oriented along the y direction.
For TM waves, $\mathbf{E}^{\text{TM}}$ lies in the (x, z) plane and has nonzero components along both the x
and z axes. The two surface waves propagate in the (x, y) plane such that $k_\parallel^2 = k_{j,\perp}^2 + k_0^2\epsilon_{jd}$
for j = 1, 2. The general dispersion relations for the two surface waves in graphene are
given by

$$\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega) = \begin{cases} i\omega\epsilon_{1d}/k_{1,\perp} + i\omega\epsilon_{2d}/k_{2,\perp} & \text{for TM}, \\ (k_{1,\perp} + k_{2,\perp})/(i\omega\mu_0) & \text{for TE}, \end{cases} \tag{3.138}$$

where $\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega)$ is the conductivity of graphene at the frequency $\omega$.

The preceding equation shows that the plasmonic features of graphene are mainly set 
by the imaginary part of its conductivity. It is possible to change the sign of the imaginary 
part by changing graphene’s chemical potential [227]. When the imaginary part is positive, 
graphene shows metallic features and can support SPPs of the TM type under the right 
conditions [209, 220, 231, 232]. However, when the imaginary part is negative, graphene 
loses this ability but may be able to sustain a weak TE-type surface wave [220, 232]. It is 
instructive to look at the special case in which the same dielectric medium surrounds the 
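graphene sheet on both sides (i.e., $\epsilon_{1d} = \epsilon_{2d} = \epsilon_d$). The dispersion relation in Eq. (3.138)
in this case takes the form [220, 231, 233]

$$\mathcal{F}\{\sigma_{\text{graphene}}\}(\omega) = \begin{cases} 2i\omega\epsilon_d/k_\perp & \text{for TM}, \\ 2k_\perp/(i\omega\mu_0) & \text{for TE}, \end{cases} \tag{3.139}$$

where we used $k_\perp = k_{1,\perp} = k_{2,\perp}$. The propagation constant along the interface is given by
$k_\parallel^2 = k_\perp^2 + \epsilon_dk_0^2$.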

One can evaluate Eq. (3.139) approximately to gain some insight into the plasmonic
behavior of graphene. Two figures of merit have been suggested to classify plasmonic
materials [234]. The first one describes the confinement of the surface wave with respect to
the vacuum wavelength and is given by $\text{FOM}_{\text{conf}} = (2\pi c/\omega)\,\mathrm{Re}(k_\perp)$. The second measure
provides a normalized value for the SPP's propagation distance by introducing $\text{FOM}_{\text{prop}} =
\mathrm{Re}(k_\parallel)/[2\pi\,\mathrm{Im}(k_\parallel)]$. We calculate these figures of merit for highly doped graphene using
the conductivity given by the Drude model in Eq. (3.119). The dispersion relation in
Eq. (3.139) then provides $k_\perp = 2\omega\epsilon_d(\omega + i\gamma)/\sigma_0$, resulting in

$$\text{FOM}_{\text{conf}} = \frac{4\pi c\,\epsilon_d}{\sigma_0}\,\omega, \qquad \text{FOM}_{\text{prop}} = \frac{\omega}{2\pi\gamma}. \tag{3.140}$$

These figures of merit tell us that the mode confinement for graphene is independent of
loss, but depends inversely on the doping level through $\sigma_0 = e^2E_F/(\pi\hbar^2)$ (see Table 1.2).
In contrast, increased doping tends to boost the $\text{FOM}_{\text{prop}}$, but there is an upper bound on
frequency, after which the propagation distance begins to decrease.
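
The two figures of merit can be evaluated in a few lines. The sketch below assumes the Drude-limit conductivity $i\sigma_0/(\omega + i\gamma)$ with $\sigma_0 = e^2E_F/(\pi\hbar^2)$, the TM relation $k_\perp = 2\omega\epsilon_d(\omega + i\gamma)/\sigma_0$ with $\epsilon_d$ in absolute units, and the electrostatic approximation in which the same wave vector is used for both figures of merit. The function names and the numerical inputs are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
HBAR = 1.054571817e-34       # reduced Planck constant in J s
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
C_LIGHT = 299792458.0        # speed of light in m/s


def sigma0(fermi_energy_ev):
    """Drude weight e^2 * E_F / (pi * hbar^2) for doped graphene."""
    return E_CHARGE**2 * (fermi_energy_ev * E_CHARGE) / (math.pi * HBAR**2)


def k_perp(omega, gamma, eps_d_rel, fermi_energy_ev):
    """TM wave vector 2*omega*eps_d*(omega + i*gamma)/sigma0 (same medium on both sides)."""
    return 2 * omega * EPS0 * eps_d_rel * (omega + 1j * gamma) / sigma0(fermi_energy_ev)


def figures_of_merit(omega, gamma, eps_d_rel, fermi_energy_ev):
    """Return (FOM_conf, FOM_prop) in the electrostatic approximation."""
    k = k_perp(omega, gamma, eps_d_rel, fermi_energy_ev)
    fom_conf = (2 * math.pi * C_LIGHT / omega) * k.real
    fom_prop = k.real / (2 * math.pi * k.imag)
    return fom_conf, fom_prop


if __name__ == "__main__":
    w = 2 * math.pi * 30e12      # 30 THz, illustrative
    print(figures_of_merit(w, 1.0e13, 1.0, 0.4))
    print(figures_of_merit(w, 1.0e13, 1.0, 0.8))  # doubling E_F halves FOM_conf
```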

The preceding dielectric models can be refined using information gained through the ab 
initio methods that solve variants of the Schrödinger equation. A full quantum-mechanical 
treatment of the dielectric function would require the calculation of the material’s wave 
function using multiple electrons and nuclei. However, considering the large mass of the 
nuclei relative to electrons, it is possible to use the Born-Oppenheimer approximation and
decouple the electronic motion from the nuclear motion [222]. Even with this approximation, a full
quantum-mechanical treatment remains difficult owing to the computational complexity, 
and further approximations are often made. In one approach, density functional theory is 
used with the local-density approximation for the exchange and correlation functions [235]. 
In another approach, a semianalytical dielectric function is constructed that depends on the 
jellium model for electrons that are confined by infinite potential barriers at the physical 
edges of the nanosize object [236]. 


Dissipation and Decoherence 


I don’t demand that a theory correspond to reality because I don’t know what it is. 
Reality is not a quality you can test with litmus paper. All I am concerned with is that 
the theory should predict the results of measurements. Quantum theory does this very 
successfully. It predicts that the result of an observation is either that the cat is alive or 
that it is dead. It is like you can’t be slightly pregnant: you either are or you aren’t. 
Stephen Hawking 


4.1 Effect of Environment on a Quantum Device 
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All quantum devices operate at a finite temperature and are thus subject to thermodynamic 
laws. The second law of thermodynamics states that a state variable called entropy exists 
and its value changes with time because of external and internal perturbations such that 
[237] 


$$dS = dS|_{\text{ext}} + dS|_{\text{int}}, \tag{4.1}$$


where the internal change is never negative: $dS|_{\text{int}} \ge 0$. The external change in entropy
depends on temperature as $dS|_{\text{ext}} = \delta Q/T$, where $\delta Q$ is the amount of heat entering the
closed system and $T$ is the common temperature at the point where the heat transfer took
place. This change vanishes, by definition, for a "thermodynamically closed" system for
which $\delta Q = 0$. For a given physical process, the entropy $S$ of the whole system (including
its environment) remains constant if the process is reversible. An example of a reversible
process is a qubit register, which is a quantum circuit made of multiple reversible quantum
gates (analogous to the registers used in electronic computers) (see Aside 2.10). Another
example is the flow of electric current through a wire with zero resistance. In contrast, the
combined entropy of a thermodynamically closed system (including its environment) can
only increase when the underlying physical process is irreversible because $dS|_{\text{int}} > 0$. This
is the reason why heat flows from the hot side to the cold side, never the reverse. Similarly,
entropy increases if an electric current flows through a wire with finite resistance.
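
The direction of heat flow can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes two ideal reservoirs whose temperatures stay fixed, and the function name and numerical values are hypothetical rather than taken from the text. It applies $\delta Q/T$ to each reservoir and adds the two contributions.

```python
def entropy_change(heat_joules: float, t_hot: float, t_cold: float) -> float:
    """Total entropy change (J/K) when `heat_joules` leaves a reservoir at
    t_hot and enters a reservoir at t_cold (temperatures in kelvin).

    Each reservoir is assumed large enough that its temperature stays fixed,
    so its entropy changes by dQ/T, with dQ counted as heat entering it.
    """
    if t_hot <= 0 or t_cold <= 0:
        raise ValueError("temperatures must be positive (kelvin)")
    ds_hot = -heat_joules / t_hot    # heat leaves the hot side
    ds_cold = heat_joules / t_cold   # the same heat enters the cold side
    return ds_hot + ds_cold


# Hot-to-cold flow: the total entropy increases (irreversible process).
ds_forward = entropy_change(100.0, t_hot=400.0, t_cold=300.0)
# Cold-to-hot flow would lower the total entropy, which the second law forbids.
ds_reverse = entropy_change(-100.0, t_hot=400.0, t_cold=300.0)
print(f"forward: {ds_forward:+.4f} J/K, reverse: {ds_reverse:+.4f} J/K")
```

The forward value is positive for any pair with `t_hot > t_cold`, which is the statement that heat flows spontaneously only from hot to cold.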


4.1.1 Entropy and Time Reversal 


As entropy provides a way of measuring disorder in a system, the second law of thermo- 
dynamics can be restated as follows: The entropy of an isolated system will either increase 
(more disorder) or stay the same, but it can never decrease. Strikingly, this requirement
imposes a direction on time, in contrast to other physical laws, which could possibly be
time reversed. Most physical laws do have time-reversal symmetry built into them, that is,
if one replaces $t$ with $-t$ in their equations of motion, the transformed equations remain
consistent with the physical law they represent. As a result, they can predict the state of a 
dynamical system in the past or in the future, just by knowing its present state, which pro- 
vides the initial conditions for integrating the system’s equations of motion. Clearly, the 
second law of thermodynamics breaks this feature. The British astronomer Arthur Edding- 
ton coined and popularized the term time’s arrow to refer to the “one-way nature” of a 
thermodynamic system [238]. The “arrow of time” is one of the most salient features of 
our universe because it is intimately related to entropy, which demands this asymmetry. 
However, why this is so is one of the unsolved mysteries in modern science. The laws of 
physics have no control over the direction of time; they depend on it! 

What we do know is that quantum mechanics provides accurate predictions at the micro- 
scopic level. However, governing physical laws at the microscopic level appear to be 
entirely symmetric in time, that is, if the direction of time were to reverse, the analysis 
and predictions based on quantum mechanics would remain true. This feature is obviously 
contradictory to our day-to-day perception because we perceive an obvious direction (or 
flow) of time. 

Time reversal in the quantum regime requires careful implementation because it is not
sufficient to just change the sign of the time variable. Wigner was the first person to
consider how the Schrödinger equation behaves under time reversal [239]. He considered a
Hamiltonian that was invariant under time reversal and found that the time-reversal
symmetry was restored by complex conjugation of the wave function. The reason for this can
be understood by using Wigner's antiunitary operator $T$ defined as $T c\,|q\rangle = c^*\,|q\rangle$ for any
complex number $c$. Owing to this definition, when the operator $T$ acts on two wave
functions, say $|\phi\rangle$ and $|\psi\rangle$, it gives the result $(T|\phi\rangle, T|\psi\rangle) = \langle\psi|\phi\rangle$. Recall that $\langle\psi|\phi\rangle$ is the
complex conjugate of $\langle\phi|\psi\rangle$.

To see how the antiunitary operator leads to time symmetry, consider a state $|p\rangle$ of
constant momentum and express it in the coordinate basis as
$|p\rangle = \frac{1}{\sqrt{2\pi}}\int \exp(ipq)\,|q\rangle\,dq$. It follows that

$$T|p\rangle = \frac{1}{\sqrt{2\pi}}\int T\exp(ipq)\,|q\rangle\,dq = \frac{1}{\sqrt{2\pi}}\int \exp(-ipq)\,|q\rangle\,dq = |{-p}\rangle. \tag{4.2}$$


That is, the particle's momentum is flipped under the action of the $T$ operator. In the
Schrödinger picture, the Hamiltonian $H$ remains invariant under time reversal because it is
an even function of the momentum operator. However, we should consider how the state
$\psi(t)$ is affected by the operator $T$. Using the time-evolution operator, $U(t) = \exp(-iHt/\hbar)$,
it is easy to show that

$$T\psi(t) = T\,U(t)\,\psi(0) = U(-t)\,\psi^*(0). \tag{4.3}$$


Thus, the correct recipe for time reversal is to reverse the sign of t followed by complex 
conjugation of the wave function. For unhindered quantum flow, future and past are mere 
conventions resulting from the same governing equations of motion. 
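
The time-reversal recipe can be verified numerically for a small system. The following is a minimal sketch, not part of the original derivation: it assumes $\hbar = 1$, represents $T$ as complex conjugation in a basis where the Hamiltonian is a real symmetric matrix (so that it is invariant under $T$), and uses a randomly generated Hamiltonian and state purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real symmetric Hamiltonian (time-reversal invariant), with hbar = 1.
n = 6
m = rng.normal(size=(n, n))
hamiltonian = (m + m.T) / 2
energies, modes = np.linalg.eigh(hamiltonian)


def evolve(psi: np.ndarray, t: float) -> np.ndarray:
    """Apply U(t) = exp(-i H t) using the eigen-decomposition of H."""
    phases = np.exp(-1j * energies * t)
    return modes @ (phases * (modes.T @ psi))


# Random normalized complex initial state psi(0).
psi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
psi0 /= np.linalg.norm(psi0)

t = 1.7
lhs = np.conj(evolve(psi0, t))     # T psi(t): complex conjugation of the evolved state
rhs = evolve(np.conj(psi0), -t)    # U(-t) psi*(0): conjugated state run backward
print(np.allclose(lhs, rhs))
```

Conjugating the state without also reversing the sign of `t` does not reproduce `lhs`, which is the numerical counterpart of the statement that changing the sign of the time variable alone is not sufficient.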

The next question is: what are the implications to the physical laws if time is not
reversible? Intuitively, a process should be reversible if there is no loss of coherence or 
energy. Indeed, dissipation and decoherence are two irreversible processes common to any 
quantum system interacting with its environment. Here, dissipation refers to any process 
for which irreversible loss of energy happens. Decoherence is the phenomenon where a 
superposition of macroscopically distinct states loses its coherence because of phase fluc- 
tuations. Phase variations reduce the ability of different wave functions to interfere with 
each other because of the interaction of a quantum system with its surroundings. Even 
though these two processes are fundamentally different and occur on different time scales, 
the net effect of both on a quantum system is to produce an irreversible change in the mea- 
surable quantities of interest. In almost all cases, it is possible to separate a quantum system 
into a relevant part (representing a useful quantum device) and an irrelevant part (called 
the reservoir) that are coupled to each other. Often, the reservoir is a broadband system with 
a large number of degrees of freedom that is in thermal equilibrium with its environment. 
For this reason, the reservoir is also called a (heat) bath. In general, a quantum device has 
little influence on its reservoir because of the reservoir’s many degrees of freedom, but the 
reservoir affects the device through dissipation (energy loss) and phase fluctuations. These 
fluctuations are included in the dynamics of the quantum device through an effective force 
whose characteristics depend on the reservoir and induce decoherence that increases the 
device’s entropy through disorder. 


4.1.2 Spontaneous and Stimulated Emissions 


The simplest yet most instructive example of irreversible decay is the phenomenon of spon- 
taneous emission from an excited atom or molecule. The irreversibility in this process can 
be traced back to a relatively short memory of the reservoir (the Markovian property) and 
the resulting exponential decay of the probability for the atom being in an excited state. 
This suggests that qualitatively different dynamics can be achieved if the conditions are 
changed. For example, it is possible for an atom coupled to a single mode of the electro- 
magnetic field to undergo periodic exchange of energy between the atom and the field, 
resulting in the well-known Rabi oscillations. To understand the process of spontaneous 
emission, it is interesting to consider how the underlying theory evolved over time. 

In the early days of quantum mechanics, scientists discovered that energy associated
with an atom is quantized through discrete energy levels. Bohr proposed a model of atomic
structure based on such discrete energy levels, connected through emission of photons of
energy $\hbar\omega = E_m - E_n$ between two energy levels with $E_m > E_n$. However, this process is
feasible only if there is a mechanism to excite a low-energy atom to a higher energy state.
One mechanism is the absorption of photons that can raise an atom to a higher energy
level. However, such an upward transition can occur only at discrete frequencies for which
the energy of the photon $\hbar\omega$ equals the energy difference $E_m - E_n$ of the two states involved.

As the absorption and emission processes seem to be connected with each other, Einstein 
made the first attempt in 1917 to find a relationship between them. Rather than considering 
a multilevel atom, he considered a simple system with only two energy levels: a ground 
level with the quantum state $|g\rangle$ and an excited level with the quantum state $|e\rangle$. The energy
difference between these two states is $E_e - E_g = \hbar\omega_{eg}$. Einstein first considered only
two processes: (1) absorption that can only occur in the presence of a radiation field and
(2) spontaneous emission that can bring the atom to the ground state even without the
presence of a field. If such a system is in thermal equilibrium at a fixed temperature $T$,
it should satisfy the Boltzmann distribution, $p_e = p_g \exp(-\hbar\omega_{eg}/k_B T)$, where $p_e$ and $p_g$
are the population densities of the two states involved. Moreover, the radiation field in 
thermal equilibrium must also satisfy Planck’s blackbody radiation formula (see Aside 
3.1). Einstein realized that it was not possible to satisfy both the Boltzmann relation and 
Planck’s blackbody radiation formula by limiting a two-level system to just absorption and 
spontaneous emission. His remedy was to postulate a third process known as stimulated 
emission, where a photon in the radiation field induces transition from the upper to lower 
energy level, followed up by the emission of a new photon. The emitted photon has exactly 
the same energy, momentum, and phase as the photon that induced the transition. Aside 
4.1 provides details of Einstein’s derivation based on the above argument. 


Aside 4.1 Einstein’s A and B Coefficients 

Einstein's A and B coefficients represent the probability of absorption or emission of light
from an atom [240]. The A coefficient is related to the rate of spontaneous emission and the
B coefficient is related to the rate of absorption and stimulated emission. They are defined
for a two-level atom with the lower energy level $E_g$ in the quantum state $|g\rangle$ and the upper
energy level $E_e$ in the quantum state $|e\rangle$. The energy difference between them is written
as $E_e - E_g = \hbar\omega_{eg} = h\nu_{eg}$. The system is in thermal equilibrium with the surrounding
blackbody radiation. Transitions between the two energy states can occur in three different
ways:

• spontaneous emission from $|e\rangle$ to $|g\rangle$ with probability $A$ per unit time;
• absorption from $|g\rangle$ to $|e\rangle$ with probability $B_{ge}R_d$ per unit time;
• stimulated emission from $|e\rangle$ to $|g\rangle$ with probability $B_{eg}R_d$ per unit time;

where $A$, $B_{eg}$, and $B_{ge}$ are constants and $R_d$ is the spectral density of radiation at the
frequency $\nu = \nu_{eg}$.

Given that the system is in thermal equilibrium, the occupation probabilities for the states
$|e\rangle$ and $|g\rangle$ are proportional to $\exp(-E_e/k_BT)$ and $\exp(-E_g/k_BT)$, respectively. Also, the
probability of upward transitions must exactly balance the probability of downward
transitions in thermal equilibrium,

$$(A + B_{eg}R_d)\exp(-E_e/k_BT) = B_{ge}R_d\exp(-E_g/k_BT). \tag{4.4}$$

This equation can be rearranged to get

$$R_d = \frac{d\rho_\nu}{d\nu} = \frac{A}{B_{ge}\exp(h\nu_{eg}/k_BT) - B_{eg}}. \tag{4.5}$$

As this expression must coincide with the spectral density given in Eq. (3.25), we
immediately obtain $B_{eg} = B_{ge} = B$, indicating that the absorption and stimulated emission
processes are intimately connected. Further, the A and B coefficients are related as

$$\frac{A}{B} = \frac{8\pi h\nu_{eg}^3}{c^3}. \tag{4.6}$$

This relation enables us to calculate Einstein’s B coefficient once the spontaneous emission 
rate A is known. Although it is possible to use semiclassical arguments to estimate the value 
of A, its proper derivation requires quantum electrodynamics and is discussed later in this 
section. 
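
The detailed-balance argument of this aside can be checked numerically. The sketch below assumes that the spectral density of Eq. (3.25) is the standard Planck energy density per unit frequency, and it takes $B_{eg} = B_{ge} = B$ with the ratio $A/B = 8\pi h\nu^3/c^3$. The function names and the transition frequency are hypothetical choices made for illustration.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s (exact SI value)
K_B = 1.380649e-23          # Boltzmann constant, J/K (exact SI value)
C_LIGHT = 2.99792458e8      # speed of light, m/s (exact SI value)


def planck_density(nu: float, temperature: float) -> float:
    """Blackbody spectral energy density per unit frequency, J s / m^3."""
    x = H_PLANCK * nu / (K_B * temperature)
    return (8 * math.pi * H_PLANCK * nu**3 / C_LIGHT**3) / math.expm1(x)


def a_over_b(nu: float) -> float:
    """Ratio A/B implied by detailed balance with B_eg = B_ge = B."""
    return 8 * math.pi * H_PLANCK * nu**3 / C_LIGHT**3


def balance_residual(nu: float, temperature: float, b_coeff: float = 1.0) -> float:
    """Relative mismatch between downward and upward rates in Eq. (4.4).

    Energies are measured from E_g, so the Boltzmann weights are
    exp(-h nu / k_B T) for |e> and 1 for |g>.
    """
    a_coeff = a_over_b(nu) * b_coeff
    r_d = planck_density(nu, temperature)
    weight_e = math.exp(-H_PLANCK * nu / (K_B * temperature))
    down = (a_coeff + b_coeff * r_d) * weight_e   # spontaneous + stimulated emission
    up = b_coeff * r_d                            # absorption
    return abs(down - up) / up


nu_eg = 5.0e14  # hypothetical optical transition frequency, Hz
residuals = [balance_residual(nu_eg, temp) for temp in (300.0, 3000.0, 6000.0)]
print(residuals)
```

The residuals stay at the level of floating-point rounding for every temperature, reflecting that the balance holds at all $T$ only when the two B coefficients are equal and A/B has the stated frequency dependence.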


Even though Einstein’s treatment of the emission and absorption processes enabled him 
to establish relations between the A and B coefficients within a thermodynamic framework, 
the values of these coefficients cannot be calculated by resorting to semiclassical argu- 
ments. What one needs is an expression for spontaneous-emission rate A, because B can 
be obtained from it using Einstein’s relation, given in Eq. (4.6). The hurdle for calculating 
A using a semiclassical approach stems from ignoring changes in the electromagnetic field 
induced by the emitted photon. Indeed, even after multiple attempts, no one has ever suc- 
ceeded in deriving the rate of spontaneous emission solely using the classical description 
of an electromagnetic field [241]. Moreover, even the Schrödinger equation of quantum 
theory cannot be used to calculate the rate of spontaneous emission. To accomplish this 
task, one has to invoke the machinery of quantum electrodynamics. 

Spontaneous emission, at the most basic level, is responsible for most of the light we see 
all around us. It is so omnipresent that there are many names associated with what is fun- 
damentally the same thing. For example, if a material is excited by some means other than 
heating, spontaneous emission is called luminescence, which may also assume other names 
depending on the underlying process. Thus, chemoluminescence creates light through a 
chemical reaction. Bioluminescence is a type of chemoluminescence used by living crea- 
tures: Angler fish produce light lures to trap prey, and fireflies glow at night. However, 
if a material first absorbs light and then emits a part of it through spontaneous emission, 
the radiation is called fluorescence. If a material has a metastable state and continues to 
fluoresce long after the incident light causing excitation is turned off, we call it a phospho- 
rescent material. Such materials are useful for making dials in clocks and watches. Even 
though we may not appreciate it, spontaneous emission is behind the multibillion-dollar 
industry of light-emitting diodes and lasers. Although lasers produce light mainly through 
stimulated emission, the trigger for stimulated emission comes from the photons emitted 
via spontaneous emission in these devices. Thus, not only does spontaneous emission set 
the scene for all fundamental radiative interactions, it also triggers these processes in many 
important quantum devices. 

Ever since it was understood that absorption, spontaneous emission, and stimulated 
emission must coexist if matter and radiation were to achieve thermal equilibrium, it was 
realized that an atom in the excited state will inevitably radiate energy through spontaneous 
emission because it is the only process that does not require any background radiation
to be present. A consequence of this reasoning is that spontaneous emission must be an
intrinsic property of matter [242]. This view, however, overlooks the fact that spontaneous
emission is not a property of an isolated excited atom but a property of an atom coupled to
its surroundings. The most distinctive feature of this emission, irreversibility, results from 
an infinite number of radiation modes (or states) available to the spontaneously emitted 
photon. If these states are modified, for example by placing the excited atom between mir- 
rors of a cavity, spontaneous emission can be enhanced, or even inhibited. Purcell predicted 
in 1946 the ability to control the rate of spontaneous emission by changing the environ- 
ment surrounding an excited system [243]. Since then, many theoretical and experimental 
studies have focused on various ways to influence the rate of spontaneous emission from 
an excited system [244, 245, 246]. 

Kleppner extended Purcell's work in 1981, and this work [247] led to a new field called
cavity quantum electrodynamics (or cavity QED). He showed that spontaneous emission 
is inhibited when the cavity’s dimensions are small compared to the radiation wavelength 
and is enhanced if the cavity is resonant with the excited atomic system. In addition, the 
cavity changes the energy levels slightly in a manner analogous to the well-known Lamb 
shift. These properties opened up the possibility of tailoring spontaneous emission to suit 
applications. For example, in the case of a laser, it is desirable to limit spontaneous emis- 
sion to modes that are lasing; in the case of solar cells, spontaneous emission is allowed in 
the detected modes [248]. An interesting example is a photonic crystal in which an excited 
quantum dot is placed inside a 3D periodic structure containing a photonic band gap that 
overlaps the electronic band edge of the quantum dot; spontaneous emission is found to be 
rigorously forbidden in such a system [248]. 

As we have mentioned, the ability to control spontaneous emission has applications 
for improving the efficiency of semiconductor devices such as solar cells, lasers, LEDs, 
spasers, and many other active quantum devices that rely on transitions between two 
excited states. Therefore, a fundamental understanding of the theory and techniques 
required to analyze such devices is critical. Dirac was the first person to derive an expres- 
sion for Einstein’s A coefficient for spontaneous emission based on quantum mechanics 
[249]. A more complete and insightful model was developed later in 1931 by Weisskopf 
and Wigner [250]. Their derivation, discussed next, has numerous applications beyond 
spontaneous emission because it is an example of a general class of problems that consider 
quantum-mechanical coupling of a small device to a large reservoir. 


4.1.3 Weisskopf–Wigner Theory


A quantum system in an excited energy state cannot remain there indefinitely: it eventually 
makes a transition to a lower energy state by spontaneously emitting a photon. This tran- 
sition is driven by the coupling of the excited atom to the reservoir surrounding it. Even 
when a vacuum acts as a reservoir for an excited atom, one must consider the atom’s cou- 
pling to the radiation modes of the vacuum. Weisskopf and Wigner considered this special 
case to calculate the rate of spontaneous emission from an excited two-level atom (a small 
device) coupled to a continuum of electromagnetic modes of vacuum (a large reservoir). 
Consider a quantum system with two energy states: the ground state |g) with energy Eg 
and an excited state |e) with energy Ee. The energy difference between these states can 
be written as $E_e - E_g = \hbar\omega_{eg}$. The vacuum surrounding this system acts as a reservoir.

Figure 4.1: Coordinate system used for representing a reservoir mode with the wave vector $\mathbf{k}$. The associated plane wave is
polarized in the transverse plane. The atomic dipole moment is oriented at an angle $\theta$ as shown.

Suppose the electromagnetic field in this reservoir is in its lowest energy state, denoted as
$|\{0\}\rangle$. When this state is excited with one photon, the associated plane wave is polarized in a
plane orthogonal to its direction of propagation governed by the vector $\mathbf{k}$, whose magnitude
is related to the frequency $\omega$ of the radiation as $k = \omega/c$. We denote unit vectors in this
plane by $\hat{\mathbf{p}}_1$ and $\hat{\mathbf{p}}_2$, as shown in Figure 4.1.

We use the state vector $|\psi(0)\rangle$ to denote the initial state at $t = 0$ and define it to be a
product state of the excited state $|e\rangle$ and the vacuum state with zero photons:

$$|\psi(0)\rangle = |e, \{0\}\rangle = |e\rangle \otimes |\{0\}\rangle. \tag{4.7}$$

When the atom makes a transition from this excited state to its ground state $|g\rangle$, a photon
will be emitted in the direction $\mathbf{k}$ with polarization $\hat{\mathbf{p}}_s$ ($s = 1, 2$). The final state of the
system can be written as $|g, 1_{\mathbf{k}s}\rangle = |g\rangle \otimes |1_{\mathbf{k}s}\rangle$, where we used the compact notation $|1_{\mathbf{k}s}\rangle$ to
identify this single photon. The initial and final states form a complete basis for expanding
the quantum state of the whole system at time $t$ as

$$|\psi(t)\rangle = a(t)\,e^{-i\omega_{eg}t}\,|e, \{0\}\rangle + \sum_{\mathbf{k},s} b_{\mathbf{k}s}(t)\,e^{-i\omega_k t}\,|g, 1_{\mathbf{k}s}\rangle. \tag{4.8}$$


Note the sum over all possible directions and polarizations of the emitted photon. 

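We need to find how the coefficients $a(t)$ and $b_{\mathbf{k}s}(t)$ evolve with time, given that $a(0) = 1$
and $b_{\mathbf{k}s}(0) = 0$ initially. This can be done by using the Hamiltonian $H$ of the whole system.
We use an extension of the Jaynes–Cummings Hamiltonian (see Section 4.1.4) to describe
the atom-field interaction in the form

$$H = \frac{1}{2}\hbar\omega_{eg}\sigma_3 + \sum_{\mathbf{k},s}\hbar\omega_k\,\hat{a}^\dagger_{\mathbf{k}s}\hat{a}_{\mathbf{k}s} + H_{\text{int}}, \tag{4.9}$$

where the first two terms represent the Hamiltonian of the atom and the field, respectively,
and $H_{\text{int}}$ describes the interaction between them. In the dipole approximation,
$H_{\text{int}} = -\mathbf{d}\cdot\mathbf{E}$, where $\mathbf{d}$ is the dipole moment operator and $\mathbf{E}$ is the electric field. Using the
quantized form of the electromagnetic field,

$$\mathbf{E} = \sum_{\mathbf{k},s}\sqrt{\frac{\hbar\omega_k}{2\epsilon_0 V}}\left(\hat{a}^\dagger_{\mathbf{k}s} + \hat{a}_{\mathbf{k}s}\right)\hat{\mathbf{p}}_s, \tag{4.10}$$

where $V$ is the volume of the reservoir, the interaction Hamiltonian can be written as

$$H_{\text{int}} = -\hbar\sum_{\mathbf{k},s}\left(C_{\mathbf{k}s}\,\hat{a}_{\mathbf{k}s}\,|e\rangle\langle g| + C^*_{\mathbf{k}s}\,\hat{a}^\dagger_{\mathbf{k}s}\,|g\rangle\langle e|\right), \tag{4.11}$$

where the coupling coefficient is defined as

$$C_{\mathbf{k}s} = \sqrt{\frac{\omega_k}{2\hbar\epsilon_0 V}}\left(\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s\right). \tag{4.12}$$

We can now use the Schrödinger equation, $i\hbar\,\frac{\partial}{\partial t}|\psi(t)\rangle = H|\psi(t)\rangle$, to obtain

$$i\hbar\left[\left(\frac{da}{dt} - i\omega_{eg}a\right)e^{-i\omega_{eg}t}\,|e, \{0\}\rangle + \sum_{\mathbf{k},s}\left(\frac{db_{\mathbf{k}s}}{dt} - i\omega_k b_{\mathbf{k}s}\right)e^{-i\omega_k t}\,|g, 1_{\mathbf{k}s}\rangle\right] \tag{4.13}$$

$$= H\left[a(t)\exp(-i\omega_{eg}t)\,|e, \{0\}\rangle + \sum_{\mathbf{k},s} b_{\mathbf{k}s}(t)\exp(-i\omega_k t)\,|g, 1_{\mathbf{k}s}\rangle\right]. \tag{4.14}$$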
k,s 


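We multiply the preceding equation first with $\langle e, \{0\}|$, then with $\langle g, 1_{\mathbf{k}s}|$, and use the
orthonormal property of the quantum states. After some algebra, we obtain

$$\frac{da}{dt} = i\sum_{\mathbf{k},s} C_{\mathbf{k}s}\exp[-i(\omega_k - \omega_{eg})t]\,b_{\mathbf{k}s}(t), \tag{4.15}$$

$$\frac{db_{\mathbf{k}s}}{dt} = iC^*_{\mathbf{k}s}\exp[i(\omega_k - \omega_{eg})t]\,a(t). \tag{4.16}$$

To solve the preceding set of equations, we first formally integrate Eq. (4.16) to obtain

$$b_{\mathbf{k}s}(t) = iC^*_{\mathbf{k}s}\int_0^t \exp[i(\omega_k - \omega_{eg})t']\,a(t')\,dt', \tag{4.17}$$

and then substitute the result back into Eq. (4.15) to get

$$\frac{da(t)}{dt} = -\sum_{\mathbf{k},s}|C_{\mathbf{k}s}|^2\int_0^t \exp[-i(\omega_k - \omega_{eg})(t - t')]\,a(t')\,dt'. \tag{4.18}$$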


Equation (4.18) is an integral equation for $a(t)$ involving summation over all reservoir
states. It can be solved only after making some reasonable approximations. Consider first
the time integral:

$$I_a(t) = \int_0^t \exp[-i(\omega_k - \omega_{eg})(t - t')]\,a(t')\,dt'. \tag{4.19}$$

The exponential term in this equation oscillates rapidly unless $\omega_k \approx \omega_{eg}$ or $t'$ is close to $t$.
Physically, the excited-state amplitude $a(t)$ is expected to vary much slower than these rapid
oscillations. As the largest contribution to the integral comes for $t'$ close to $t$, we can replace
$a(t')$ in the integrand by $a(t)$ and take it outside the integral. This is called the
Weisskopf–Wigner approximation. It can also be recognized as the Markovian approximation because
the dynamics of $a(t)$ does not depend on earlier times $t' < t$ (the system has no memory of
the past). Using $\tau = t - t'$, the time integral becomes

$$I_a(t) \approx a(t)\int_0^t \exp[-i(\omega_k - \omega_{eg})\tau]\,d\tau. \tag{4.20}$$

Since the largest contribution to this integral comes from the region near $\tau = 0$, we can
extend the upper limit to $\infty$ without introducing much error. This integral then can be
evaluated using the Sokhotski–Plemelj formula given in Aside 3.3. The result is

$$I_a(t) = a(t)\left[\pi\delta(\omega_k - \omega_{eg}) - i\,\mathrm{P.V.}\left(\frac{1}{\omega_k - \omega_{eg}}\right)\right]. \tag{4.21}$$
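
The step from Eq. (4.20) to Eq. (4.21) can be made explicit with a convergence factor. The following is a sketch of that standard argument, assuming a small damping $\epsilon > 0$ (a symbol introduced here only for illustration) that is taken to zero at the end, and writing $\Delta = \omega_k - \omega_{eg}$ for the detuning:

```latex
\int_0^\infty e^{-i\Delta\tau}\,d\tau
  = \lim_{\epsilon\to 0^+}\int_0^\infty e^{-(\epsilon + i\Delta)\tau}\,d\tau
  = \lim_{\epsilon\to 0^+}\frac{1}{\epsilon + i\Delta}
  = \lim_{\epsilon\to 0^+}\left[\frac{\epsilon}{\Delta^2+\epsilon^2}
      - i\,\frac{\Delta}{\Delta^2+\epsilon^2}\right]
  = \pi\,\delta(\Delta) - i\,\mathrm{P.V.}\!\left(\frac{1}{\Delta}\right).
```

The real part is a Lorentzian of area $\pi$ that narrows to a delta function, and it is this part that produces the decay rate. The imaginary part tends to the principal value of $1/\Delta$ and produces the frequency shift.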


The sum on the right side of Eq. (4.18) can be done in the continuum limit by following 
the details in Aside 4.2. 


Aside 4.2 The Continuum Limit 

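We calculate the sum $\sum_{\mathbf{k},s}|C_{\mathbf{k}s}|^2$ in the continuum limit (quantization volume $V \to \infty$)
where the number of reservoir states becomes infinite. In this limit we replace the sum over
$\mathbf{k}$ by an integral as

$$\sum_{\mathbf{k},s}|C_{\mathbf{k}s}|^2 \to \sum_{s=1}^{2}\int |C_{\mathbf{k}s}|^2\,\mathrm{Dos}(\mathbf{k})\,d^3k, \tag{4.22}$$

where $\mathrm{Dos}(\mathbf{k}) = V/(2\pi)^3$ is the density of states in the k-space (see Aside 1.4 for details).
The volume integration in the k space can be done using the spherical coordinates $(k, \theta, \phi)$
to obtain (see Fig. 4.1)

$$\sum_{\mathbf{k},s}|C_{\mathbf{k}s}|^2 = \sum_{s=1}^{2}\int \frac{\omega_k}{2\hbar\epsilon_0 V}\,|\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s|^2\,\mathrm{Dos}(\mathbf{k})\,d^3k
= \int_0^\infty \frac{\omega_k k^2\,dk}{2(2\pi)^3\hbar\epsilon_0}\left[\int_0^{\pi}\!\!\int_0^{2\pi}\sum_{s=1}^{2}|\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s|^2\sin\theta\,d\theta\,d\phi\right]. \tag{4.23}$$

Consider the sum over $s$ within the square brackets. Using

$$\sum_{s=1}^{2}|\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s|^2 = |\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_1|^2 + |\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_2|^2 \tag{4.24}$$

and noting that $(\hat{\mathbf{p}}_1, \hat{\mathbf{p}}_2, \mathbf{k})$ form a Cartesian coordinate system (see Fig. 4.1) we obtain

$$\sum_{s=1}^{2}|\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s|^2 = |\langle e|\mathbf{d}|g\rangle|^2 - |\langle e|\mathbf{d}|g\rangle\cos\theta|^2 = |\langle e|\mathbf{d}|g\rangle|^2\sin^2\theta. \tag{4.25}$$

Using this result and noting that

$$\int_0^{\pi}\!\!\int_0^{2\pi}\sin^3\theta\,d\theta\,d\phi = \frac{8\pi}{3}, \tag{4.26}$$

the sum in Eq. (4.23) becomes

$$\sum_{\mathbf{k},s}|C_{\mathbf{k}s}|^2 = \frac{|\langle e|\mathbf{d}|g\rangle|^2}{6\pi^2\hbar\epsilon_0 c^3}\int_0^\infty \omega_k^3\,d\omega_k, \tag{4.27}$$

where we have changed the integration variable from $k$ to $\omega_k$ using $\omega_k = ck$.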


We can now solve Eq. (4.18) approximately. Using the results in Eqs. (4.21) and (4.27),
we obtain

$$\frac{da}{dt} = -\left(\frac{\gamma_a}{2} - i\Delta\omega_{LS}\right)a(t), \tag{4.28}$$

where the decay rate $\gamma_a$ and the frequency shift $\Delta\omega_{LS}$ are defined as

$$\gamma_a = \frac{\omega_{eg}^3\,|\langle e|\mathbf{d}|g\rangle|^2}{3\pi\epsilon_0\hbar c^3}, \tag{4.29}$$

$$\Delta\omega_{LS} = \frac{|\langle e|\mathbf{d}|g\rangle|^2}{6\pi^2\epsilon_0\hbar c^3}\,\mathrm{P.V.}\int_0^\infty \frac{\omega_k^3\,d\omega_k}{\omega_k - \omega_{eg}}. \tag{4.30}$$

Equation (4.28) can be easily integrated to obtain the final result

$$a(t) = a(0)\exp[-(\gamma_a/2 - i\Delta\omega_{LS})t]. \tag{4.31}$$

It shows that the excited-state amplitude decays exponentially. Its phase also changes
because of the frequency shift $\Delta\omega_{LS}$ known as the Lamb shift. The rate of spontaneous
emission is related to the probability of the atom remaining in the excited state given by
$|a(t)|^2 = \exp(-\gamma_a t)$. Indeed, $\gamma_a$ is just Einstein's A coefficient.
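
To get a feel for the magnitude of $\gamma_a$, Eq. (4.29) can be evaluated for a concrete transition. The example below is an illustration that does not come from the text: it assumes the hydrogen 2p (m = 0) to 1s transition, with the textbook dipole matrix element $(128\sqrt{2}/243)\,e a_0$ and the Lyman-alpha wavelength of about 121.567 nm. The function name is hypothetical.

```python
import math

# CODATA constants (SI units)
HBAR = 1.054571817e-34        # reduced Planck constant, J s
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
C_LIGHT = 2.99792458e8        # speed of light, m/s
E_CHARGE = 1.602176634e-19    # elementary charge, C
A_BOHR = 5.29177210903e-11    # Bohr radius, m


def decay_rate(omega_eg: float, dipole: float) -> float:
    """gamma_a = omega_eg^3 |<e|d|g>|^2 / (3 pi eps0 hbar c^3), in 1/s."""
    return omega_eg**3 * dipole**2 / (3 * math.pi * EPS0 * HBAR * C_LIGHT**3)


# Assumed example (not from the text): hydrogen 2p (m = 0) -> 1s.
wavelength = 121.567e-9                                   # Lyman-alpha, m
omega = 2 * math.pi * C_LIGHT / wavelength                # transition frequency, rad/s
dipole = (128 * math.sqrt(2) / 243) * E_CHARGE * A_BOHR   # |<1s|e z|2p>|, C m

gamma_a = decay_rate(omega, dipole)
print(f"gamma_a = {gamma_a:.3e} 1/s, lifetime = {1e9 / gamma_a:.2f} ns")
```

Under these assumptions the rate comes out at roughly $6 \times 10^8$ s$^{-1}$, corresponding to a lifetime of about 1.6 ns, which is the order of magnitude commonly quoted for this transition.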

As mentioned earlier, the Schrödinger equation cannot describe the process of spontaneous
emission if we just focus on the two-level system and ignore its surroundings acting
as a reservoir. The reason is that the Schrödinger equation describes processes that are time
reversible. Spontaneous emission describes an irreversible process that transfers energy
from the atom to its surroundings by emitting a photon into one of the reservoir states.
The Weisskopf–Wigner theory introduces irreversibility into the Schrödinger equation by
invoking two approximations: (1) the available reservoir modes cover a very broad
spectrum and (2) the excited-state amplitude $a(t)$ depends only on $t$ and not on earlier times
(i.e., the system has no memory of the past).
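
The exponential decay can also be seen without making the Weisskopf–Wigner approximation, by diagonalizing a discretized version of the problem. The sketch below is a toy model under stated assumptions, not the authors' calculation: $\hbar = 1$, the reservoir is replaced by equally spaced modes with identical real coupling $g$, the band is centered on the atomic resonance (so the frequency shift nearly vanishes), and $g$ is chosen so that $2\pi g^2$ times the mode density equals a target rate `gamma`. The function name and parameter values are hypothetical.

```python
import numpy as np


def excited_amplitude(times, gamma=1.0, bandwidth=200.0, n_modes=2001):
    """Amplitude a(t) of |e, {0}> for one level coupled to n_modes modes.

    Works in the interaction picture with hbar = 1. Mode detunings are
    equally spaced over [-bandwidth/2, bandwidth/2], and every mode has the
    same real coupling g chosen so that 2*pi*g**2/spacing equals gamma.
    """
    detunings = np.linspace(-bandwidth / 2, bandwidth / 2, n_modes)
    spacing = detunings[1] - detunings[0]
    g = np.sqrt(gamma * spacing / (2 * np.pi))

    # Single-excitation Hamiltonian: index 0 is |e,{0}>, the rest are |g,1_k>.
    h = np.zeros((n_modes + 1, n_modes + 1))
    h[0, 1:] = -g
    h[1:, 0] = -g
    idx = np.arange(1, n_modes + 1)
    h[idx, idx] = detunings

    energies, vectors = np.linalg.eigh(h)
    weights = vectors[0, :] ** 2   # overlap of each eigenstate with |e,{0}>
    times = np.asarray(times, dtype=float)
    return (weights[None, :] * np.exp(-1j * np.outer(times, energies))).sum(axis=1)


t = np.array([0.5, 1.0, 2.0, 3.0])
population = np.abs(excited_amplitude(t)) ** 2
for ti, p_num in zip(t, population):
    print(f"t = {ti:.1f}: numerical {p_num:.4f}, exp(-gamma t) {np.exp(-ti):.4f}")
```

The agreement holds only for times well below the recurrence time $2\pi/\text{spacing}$ of the discrete reservoir. For longer times the emitted energy returns to the atom, which illustrates why a broad continuum of modes is needed for irreversibility.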

The solution in Eq. (4.31) can be used to estimate the spectrum of spontaneously emitted
light. As the Lamb shift $\Delta\omega_{LS}$ is relatively small, we ignore it here. Using the result
for $a(t)$ in Eq. (4.16), we obtain

$$\frac{db_{\mathbf{k}s}(t)}{dt} = iC^*_{\mathbf{k}s}\exp[i(\omega_k - \omega_{eg})t - \gamma_a t/2]\,a(0). \tag{4.32}$$

This equation can be easily integrated to obtain

$$b_{\mathbf{k}s}(t) = -\frac{iC^*_{\mathbf{k}s}\,a(0)}{\gamma_a/2 - i(\omega_k - \omega_{eg})}\left(\exp[i(\omega_k - \omega_{eg})t - \gamma_a t/2] - 1\right). \tag{4.33}$$

Figure 4.2: Spontaneous-emission spectrum resulting from the Weisskopf–Wigner theory. It is centered at $\omega_{eg}$ and has a
Lorentzian shape with the FWHM $\gamma_a$.

After a long time, the probability of finding the system in the state $|g, 1_{\mathbf{k}s}\rangle$ is thus given by

$$\lim_{t\to\infty}|b_{\mathbf{k}s}(t)|^2 = \frac{|C_{\mathbf{k}s}|^2\,|a(0)|^2}{(\omega_k - \omega_{eg})^2 + \gamma_a^2/4}. \tag{4.34}$$

This result shows that the spectrum of the emitted light has a Lorentzian shape and is
centered at the frequency $\omega_{eg}$. This spectrum is shown in Figure 4.2, and its full width at
half-maximum (FWHM) is given by $\gamma_a$.
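
The width of the Lorentzian can be confirmed numerically. In the sketch below, the line shape of Eq. (4.34) is normalized to a unit peak (the prefactor $|C_{\mathbf{k}s}|^2|a(0)|^2$ is dropped), and the values of `omega_eg` and `gamma_a` are arbitrary illustrative numbers.

```python
import numpy as np


def emission_spectrum(omega, omega_eg, gamma_a):
    """Lorentzian line shape of Eq. (4.34), normalized to a unit peak."""
    return (gamma_a / 2) ** 2 / ((omega - omega_eg) ** 2 + (gamma_a / 2) ** 2)


omega_eg, gamma_a = 10.0, 0.4          # arbitrary illustrative units
omega = np.linspace(omega_eg - 5, omega_eg + 5, 200_001)
spectrum = emission_spectrum(omega, omega_eg, gamma_a)

# Width of the region where the spectrum exceeds half of its peak value.
above_half = omega[spectrum >= 0.5 * spectrum.max()]
fwhm = above_half[-1] - above_half[0]
print(f"numerical FWHM = {fwhm:.4f}, gamma_a = {gamma_a}")
```

The half-maximum points sit at $\omega_{eg} \pm \gamma_a/2$, so the measured width matches $\gamma_a$ to within the grid resolution.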


4.1.4 Jaynes–Cummings Hamiltonian


The Weisskopf—Wigner theory describes the evolution of a two-level system coupled to a 
large number of the radiation modes of a reservoir. The simplest example of this scenario 
is the Jaynes-Cummings model, where a two-level atom is coupled to a single electro- 
magnetic mode of the reservoir. Here we discuss the Hamiltonian associated with this 
model. 

As before, we consider an atom with two energy states: the ground state $|g\rangle$ with energy
$E_g$ and an excited state $|e\rangle$ with energy $E_e$. The transition frequency is related to the energy
difference as $E_e - E_g = \hbar\omega_{eg}$. We assume that the frequency $\omega_k$ of the electromagnetic field
is nearly resonant with $\omega_{eg}$. The assumption $\omega_k \approx \omega_{eg}$ ensures that only one mode
couples strongly to the two-level atom. The interaction of the atom with this mode is studied
through a Hamiltonian containing three parts

$$H_{JC} = H_{eg} + H_f + H_{egf}, \tag{4.35}$$

where $H_{eg}$ is the Hamiltonian of the two-level atom, $H_f$ is the Hamiltonian of the
electromagnetic field, and $H_{egf}$ is the interaction Hamiltonian.

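It is common to use the Pauli spin matrices to write the Hamiltonian of a two-level atom
[251]. The raising and lowering operators, $\sigma_+$ and $\sigma_-$, introduced in Aside 4.3, can be used
to obtain the following relations [252].

$$\sigma_+|e\rangle = 0, \quad \sigma_+|g\rangle = |e\rangle, \quad \sigma_-|g\rangle = 0, \quad \sigma_-|e\rangle = |g\rangle, \tag{4.36}$$

$$[\sigma_3, \sigma_\pm] = \pm 2\sigma_\pm, \quad [\sigma_+, \sigma_-] = \sigma_3, \quad \sigma_+^2 = \sigma_-^2 = 0, \quad \sigma_+^\dagger = \sigma_-. \tag{4.37}$$

The Hamiltonian of a two-level system can be written in the form

$$H_{eg} = \frac{1}{2}\hbar\omega_{eg}\left(|e\rangle\langle e| - |g\rangle\langle g|\right) = \frac{1}{2}\hbar\omega_{eg}\sigma_3, \tag{4.38}$$

where we chose the origin of energy halfway between the two energy levels such that
$E_e = \frac{1}{2}\hbar\omega_{eg}$ and $E_g = -\frac{1}{2}\hbar\omega_{eg}$.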


Aside 4.3 Pauli Spin Operators and Matrices 


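A spin-$\frac{1}{2}$ system and a two-level atom have identical Hilbert spaces. The states $|g\rangle$ and $|e\rangle$
form an orthonormal basis such that $\langle g|g\rangle = \langle e|e\rangle = 1$ and $\langle g|e\rangle = \langle e|g\rangle = 0$. We define
the Pauli spin operators in this basis as

$$\sigma_1 = |g\rangle\langle e| + |e\rangle\langle g|, \quad \sigma_2 = i\left(|g\rangle\langle e| - |e\rangle\langle g|\right), \quad \sigma_3 = |e\rangle\langle e| - |g\rangle\langle g|, \tag{4.39}$$

$$\sigma_+ = |e\rangle\langle g| = (\sigma_1 + i\sigma_2)/2, \quad \sigma_- = |g\rangle\langle e| = (\sigma_1 - i\sigma_2)/2. \tag{4.40}$$

With these definitions, it is easy to show that the Pauli spin operators $(\sigma_1, \sigma_2, \sigma_3)$ satisfy
the following relations:

$$(\sigma_k)^2 = 1, \quad \mathrm{Tr}[\sigma_k] = 0, \quad [\sigma_m, \sigma_n] = 2i\epsilon_{mnk}\sigma_k. \tag{4.41}$$

If we represent the states $|e\rangle$ and $|g\rangle$ in a matrix form as

$$|e\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad |g\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{4.42}$$

the Pauli spin operators have the following equivalent matrix representation:

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{4.43}$$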
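
The operator relations of this aside are easy to confirm by direct matrix multiplication. The sketch below assumes the column-vector convention in which $|e\rangle = (1, 0)^T$ and $|g\rangle = (0, 1)^T$, builds each operator from outer products, and tests a few of the identities. The helper names are hypothetical.

```python
import numpy as np

# Basis assumed here: |e> = (1, 0)^T and |g> = (0, 1)^T.
ket_e = np.array([[1], [0]], dtype=complex)
ket_g = np.array([[0], [1]], dtype=complex)


def outer(ket_a, ket_b):
    """Return the operator |a><b| as a 2x2 matrix."""
    return ket_a @ ket_b.conj().T


sigma_1 = outer(ket_g, ket_e) + outer(ket_e, ket_g)
sigma_2 = 1j * (outer(ket_g, ket_e) - outer(ket_e, ket_g))
sigma_3 = outer(ket_e, ket_e) - outer(ket_g, ket_g)
sigma_plus = (sigma_1 + 1j * sigma_2) / 2
sigma_minus = (sigma_1 - 1j * sigma_2) / 2


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return a @ b - b @ a


identity = np.eye(2)
paulis = (sigma_1, sigma_2, sigma_3)
checks = {
    "sigma_k^2 = 1": all(np.allclose(s @ s, identity) for s in paulis),
    "Tr sigma_k = 0": all(abs(np.trace(s)) < 1e-12 for s in paulis),
    "[s1, s2] = 2i s3": np.allclose(commutator(sigma_1, sigma_2), 2j * sigma_3),
    "[s3, s+] = 2 s+": np.allclose(commutator(sigma_3, sigma_plus), 2 * sigma_plus),
    "[s+, s-] = s3": np.allclose(commutator(sigma_plus, sigma_minus), sigma_3),
    "s+ = |e><g|": np.allclose(sigma_plus, outer(ket_e, ket_g)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```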


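The field Hamiltonian $H_f$ requires the quantized form of the electric field,

$$\mathbf{E} = \sqrt{\frac{\hbar\omega_k}{2\epsilon_0 V}}\left(\hat{a}^\dagger_{\mathbf{k}s} + \hat{a}_{\mathbf{k}s}\right)\hat{\mathbf{p}}_s, \tag{4.44}$$

where $\hat{\mathbf{p}}_s$ is the polarization unit vector ($s = 1$ or 2) and $\hat{a}^\dagger_{\mathbf{k}s}$ and $\hat{a}_{\mathbf{k}s}$ are the creation and
annihilation operators for the field at the frequency $\omega_k$ with $k = \omega_k/c$. The constant factor
represents the average field per photon inside a cavity of volume $V$. In this representation,
the Hamiltonian of the electromagnetic field is given by

$$H_f = \hbar\omega_k\,\hat{a}^\dagger_{\mathbf{k}s}\hat{a}_{\mathbf{k}s}, \tag{4.45}$$

where we have discarded the zero-point energy because it plays no role in the present
situation.

The interaction energy in the dipole approximation is given as $V = -\mathbf{d}\cdot\mathbf{E}$, where
$\mathbf{d}$ is the dipole-moment operator. In the current basis, we can write the interaction
Hamiltonian as

$$H_{egf} = |e\rangle\langle e|\,V\,|g\rangle\langle g| + |g\rangle\langle g|\,V\,|e\rangle\langle e|. \tag{4.46}$$

Using $V = -\mathbf{d}\cdot\mathbf{E}$ and $\mathbf{E}$ from Eq. (4.44), it becomes [253]

$$H_{egf} = -\hbar\left(\hat{a}^\dagger_{\mathbf{k}s} + \hat{a}_{\mathbf{k}s}\right)\left[C_{\mathbf{k}s}\sigma_+ + C^*_{\mathbf{k}s}\sigma_-\right], \tag{4.47}$$

where we have defined $C_{\mathbf{k}s}$ as

$$C_{\mathbf{k}s} = \sqrt{\frac{\omega_k}{2\hbar\epsilon_0 V}}\left[\langle e|\mathbf{d}|g\rangle\cdot\hat{\mathbf{p}}_s\right]. \tag{4.48}$$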


This quantity is often referred to as the Rabi frequency (especially when it is real). We keep 
the present notation to allow for its complex values. 

We can further simplify the interaction Hamiltonian by invoking the rotating-wave
approximation. In this approximation, the rapidly oscillating terms are neglected. More
explicitly, terms oscillating at the low frequency $\omega_k - \omega_{eg}$ are kept, while terms oscillating
at the high frequency $\omega_k + \omega_{eg}$ are neglected (as their contribution averages out to zero
over a few cycles). To understand the origin of these frequencies, we need to consider the
evolution of the operators appearing in Eq. (4.47) in the Heisenberg picture. It is easy to see
from Eq. (4.45) that $\hat{a}_{ks}$ evolves in time as $\exp(-i\omega_k t)$. Similarly, the atomic Hamiltonian
$H_{eg}$ can be used to show that the operators $\sigma_+ = |e\rangle\langle g|$ and $\sigma_- = |g\rangle\langle e|$ evolve in time as
$\exp(i\omega_{eg}t)$ and $\exp(-i\omega_{eg}t)$, respectively. It follows that the combinations $\hat{a}_{ks}\sigma_+$ and $\hat{a}_{ks}^\dagger\sigma_-$
oscillate at the frequency $\omega_k - \omega_{eg}$. In contrast, the combinations $\hat{a}_{ks}\sigma_-$ and $\hat{a}_{ks}^\dagger\sigma_+$ oscillate
at the frequency $\omega_k + \omega_{eg}$. In the rotating-wave approximation, we discard the latter terms
and write the Jaynes-Cummings Hamiltonian as


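$$H_{JC} = \frac{1}{2}\hbar\omega_{eg}\sigma_z + \hbar\omega_k \hat{a}_{ks}^\dagger \hat{a}_{ks} - \hbar\left(C_{ks}\hat{a}_{ks}\sigma_+ + C_{ks}^*\hat{a}_{ks}^\dagger\sigma_-\right). \tag{4.49}$$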


Physically, $\hat{a}_{ks}\sigma_+$ corresponds to the absorption of a photon that raises the atom to its
excited state and $\hat{a}_{ks}^\dagger\sigma_-$ corresponds to the emission of a photon that returns the atom to
its ground state. These two terms conserve the total energy of the system. In contrast, the 
nonresonant terms that we dropped do not conserve energy. The accuracy of the rotating- 
wave approximation increases when the electromagnetic field is nearly resonant with the 
two-level atom and has a relatively small amplitude. 
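
The bookkeeping behind the rotating-wave approximation can be checked numerically. The sketch below is only an illustration under stated assumptions: the field mode is truncated to a few photons, the coupling value `C` and the frequencies are arbitrary, and the helper `two_level_plus_mode` is a name introduced here. It verifies that the terms kept in Eq. (4.49) commute with the total excitation number (photons plus atomic excitation), whereas the full interaction of Eq. (4.47) does not.

```python
import numpy as np

def two_level_plus_mode(n_max):
    """Operators on (atom) x (field mode truncated at n_max photons)."""
    a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)   # annihilation operator
    sp = np.array([[0, 1], [0, 0]], dtype=complex)        # sigma_+ = |e><g|, basis (e, g)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)       # sigma_z
    I2, If = np.eye(2), np.eye(n_max + 1)
    return np.kron(I2, a), np.kron(sp, If), np.kron(sz, If)

hbar = 1.0
w_eg, w_k, C = 1.0, 1.05, 0.1 + 0.02j     # illustrative values, not from the text
a, sp, sz = two_level_plus_mode(n_max=8)
ad, sm = a.conj().T, sp.conj().T

H0 = 0.5 * hbar * w_eg * sz + hbar * w_k * ad @ a
H_jc = H0 - hbar * (C * a @ sp + np.conj(C) * ad @ sm)        # Eq. (4.49)
H_full = H0 - hbar * (a + ad) @ (C * sp + np.conj(C) * sm)    # free terms plus Eq. (4.47)

N_exc = ad @ a + sp @ sm                  # photon number + atomic excitation

def comm(X, Y):
    return X @ Y - Y @ X

rwa_conserves = np.allclose(comm(H_jc, N_exc), 0)
full_conserves = np.allclose(comm(H_full, N_exc), 0)
print(rwa_conserves, full_conserves)
```

The counter-rotating terms change the excitation number by two, which is why the second commutator does not vanish.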

Several methods can be used to study the dynamics of the Jaynes-Cummings Hamil- 
tonian [254, 255]. In a widely used method, we first find the stationary states of this 
Hamiltonian, known as the “dressed states,” and then write the state of the system as a 
superposition of these dressed states [256]. In a variant of this method, called the Stenholm 
method [257], the problem is solved by noting that integer powers of the Hamiltonian 
can be calculated using analytical methods. Open quantum systems can be modeled by 
replacing the pure-state description of the system with a density-matrix description. 
Figure 4.3 The device plus reservoir (D + R) model commonly used for quantum devices. The combined Hamiltonian of the
closed system, $H = H_D + H_I + H_R$, includes a weak coupling term $H_I$. The reservoir has many degrees of freedom and is not affected by the
device.


4.1.5 General Reservoir Model 


The time evolution of any closed quantum system is governed by the Schrödinger equation. 
As discussed in Section 2.2, this equation shows that the quantum state $|\Psi(t)\rangle$ evolves from
its initial state $|\Psi(t_0)\rangle$ as

$$|\Psi(t)\rangle = U(t, t_0)\,|\Psi(t_0)\rangle, \tag{4.50}$$

where $U(t, t_0) = \exp\left[-\frac{i}{\hbar}H(t - t_0)\right]$ is the unitary evolution operator. The Hamiltonian $H$
of a closed system does not depend on time. Indeed, H cannot be a function of time, as 
this would imply an external mechanism driving the time dependence that does not exist 
for a closed system. As the Hamiltonian refers to total energy in the system, it follows that 
closed systems cannot dissipate energy as they evolve in time. 

The key question that needs to be answered is: How can one incorporate dissipation and 
decoherence within the quantum-mechanics framework? There have been many attempts 
to incorporate dissipation into quantum mechanics. The most celebrated example is pro- 
vided by lasers, which demand a satisfactory method to account for the lossy cavity that is 
required to provide optical feedback. Details about the methods used can be found in Refs. 
[258, 259, 260, 261, 262, 263]. 

It should be clear by now that we need to relax the requirement of the unitary evolution of 
a quantum system to incorporate irreversible changes. We have already seen a way of doing 
this in the Weisskopf—Wigner theory of spontaneous emission. In that case, the radiation 
modes of the vacuum surrounding a quantum system (a two-level atom) acted as a reservoir. 
We can extend this concept to all quantum devices through the scheme shown in Figure 4.3. 
In this scheme, the quantum device is coupled to a suitable reservoir, and it is this coupling 
that induces decoherence and dissipation. To make the problem tractable, we need to make 
three simplifying assumptions: (1) the quantum device couples to the reservoir weakly; (2) 
reservoir dynamics is not affected by this coupling; and (3) the reservoir is memoryless. In 
essence, interaction of the quantum device with the reservoir introduces randomness into 
the device dynamics. As a result, the device is better described using a density operator 
rather than its wave function. 

The device plus reservoir model seen in Figure 4.3 solves the problem by using a Hamiltonian
of the form $H = H_D + H_R + H_I$, where $H_D$ is the device Hamiltonian, $H_R$ is the
reservoir Hamiltonian, and $H_I$ describes the interaction between the two. However, it is
important to separate the device dynamics from that of the reservoir to make the problem
tractable. This can be done in two ways. One approach is based on the Schrödinger
picture, and the dynamics is described through a reduced density matrix using a so-called 
master equation [264, 265, 266]. The second approach is based on the Heisenberg picture, 
where the problem is solved using the so-called Langevin equations (see Section 7.3) for a 
relevant set of noise operators of the reduced system [260, 267, 268]. 

Classical physics provides deterministic laws for describing observations and gives a 
sense of certainty even though many processes in nature are stochastic. Historically, the 
advent of quantum mechanics led to a prevailing view that nature is indeterministic. In 
1926, Einstein made the following comment to Max Born on quantum mechanics: “The 
theory produces a good deal but hardly brings us closer to the secret of the Old One. I 
am at all events convinced that He does not play dice.” The probabilistic nature of quan- 
tum mechanics is evident from the Heisenberg uncertainty principle (see Aside 2.11), 
which emphasizes that observable quantities do not have definite values in the quantum 
regime. Clearly, two basic concepts central to quantum phenomena, the quantum states and 
quantum observables, are intrinsically probabilistic. In contrast to this, classical physics 
considers probabilistic models only when one has an incomplete knowledge of the system. 


4.2 Master-Equation Approach 


The device plus reservoir model in Figure 4.3 requires the use of a density matrix. However, 
owing to the computational cost and the sheer size of the full density matrix of a closed
system, a numerical approach is often impractical. In this section we discuss two types of 
master equations that are used to reduce the complexity of the problem. 


4.2.1 Master Equation for Occupation Probabilities 


In both classical and quantum mechanics, probabilistic descriptions can be used to describe 
the past, present, and future states of a system. The dynamical equations describing the 
time evolution of these probabilities are called the master equations, which are first-order 
differential equations of the form 


$$\frac{dP_m}{dt} = \sum_{n \neq m} \left(\Gamma_{m\leftarrow n}P_n - \Gamma_{n\leftarrow m}P_m\right), \tag{4.51}$$


where $P_m(t)$ with $m = 1$ to $M$ is the probability that the system is in the $m$th state and
$\Gamma_{m\leftarrow n} \ge 0$ is the transition rate for moving the system from state $n$ to state $m$ ($\Gamma_{m\leftarrow m} = 0$).
Since a master equation depends only on the transition rates, it can also be considered a
rate equation. The first term containing $\Gamma_{m\leftarrow n}$ represents a gain of probability, while the
second term with $\Gamma_{n\leftarrow m}$ corresponds to a loss of probability for the system to remain in the
mth state. It is interesting to note that the underlying process can be time reversed if the 
transition matrix is symmetric. 
If the solution of a set of master equations approaches a steady state after some time,
the resulting probability $P_m(\infty)$ is said to satisfy the detailed-balance condition. In this
situation, the relation $\Gamma_{n\leftarrow m}P_m(\infty) = \Gamma_{m\leftarrow n}P_n(\infty)$ holds for all states of the system.
Even though the probabilities do not depend on time in the steady state, the
transition rates $\Gamma_{m\leftarrow n}$ need not be time independent, provided the detailed-balance condition
holds for all times after the steady state has been reached. As we have seen in Chapter 3,
this condition is sometimes used to study multiparticle quantum interactions in devices that 
are in thermal equilibrium with large reservoirs. 

It is easy to show that the master equations conserve the total probability. Indeed if we 
sum the set in Eq. (4.51) over m, we find 


$$\sum_m \frac{dP_m}{dt} = \sum_m \sum_{n \neq m} \left(\Gamma_{m\leftarrow n}P_n - \Gamma_{n\leftarrow m}P_m\right) = 0. \tag{4.52}$$


This means that the sum $\sum_m P_m(t)$ remains constant over time. As $\sum_m P_m$ equals 1 at
$t = 0$, the same value of this sum is maintained for all times. If we also assume that all
transition rates must be positive in the master equation, the following result holds whenever
the probability $P_m$ becomes zero during its evolution:


$$\left.\frac{dP_m}{dt}\right|_{P_m = 0} = \sum_{n \neq m} \Gamma_{m\leftarrow n}P_n > 0. \tag{4.53}$$


As the derivative $dP_m/dt$ is positive at the time $P_m$ becomes zero, the momentary zero
value of $P_m$ quickly becomes positive again. This reasoning can be used to conclude that
all probabilities remain nonnegative provided they were initialized correctly at $t = 0$
[$0 \le P_m(0) \le 1$]. The requirement that all probabilities must remain below the limiting value of
1 is also satisfied because $\sum_m P_m(t) = 1$ must hold at all times, and thus no probability can have
a value greater than 1 at any time. If that were the case, one or more probabilities must
have negative values, which is not allowed.
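
The conservation and positivity arguments above can be tried out numerically. The following is a minimal sketch, assuming a randomly generated rate table `Gamma` (with `Gamma[m, n]` standing for the rate from state n to state m) and a hand-written fourth-order Runge-Kutta stepper; none of these names or values come from the text.

```python
import numpy as np

def master_rhs(P, Gamma):
    """Right side of Eq. (4.51); Gamma[m, n] is the rate for n -> m."""
    gain = Gamma @ P                      # sum_n Gamma[m, n] * P[n]
    loss = Gamma.sum(axis=0) * P          # P[m] * sum_n Gamma[n, m]
    return gain - loss

rng = np.random.default_rng(0)
M = 4
Gamma = rng.uniform(0.0, 1.0, size=(M, M))   # illustrative positive rates
np.fill_diagonal(Gamma, 0.0)                 # Gamma[m, m] = 0

P = np.array([1.0, 0.0, 0.0, 0.0])           # valid initial probabilities
dt, steps = 1e-3, 20000
min_P = P.min()
for _ in range(steps):                       # classical fourth-order Runge-Kutta
    k1 = master_rhs(P, Gamma)
    k2 = master_rhs(P + 0.5 * dt * k1, Gamma)
    k3 = master_rhs(P + 0.5 * dt * k2, Gamma)
    k4 = master_rhs(P + dt * k3, Gamma)
    P = P + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    min_P = min(min_P, P.min())              # track the smallest probability seen

total = P.sum()
print(total, min_P)
```

Up to rounding, `total` stays at 1 and no probability dips below zero along the trajectory.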

The set of master equations is often put into a matrix form for computational purposes.
As this set consists of first-order, linear differential equations, we can write it in the form

$$\frac{d}{dt}|P(t)\rangle = A(t)\,|P(t)\rangle, \tag{4.54}$$

where $|P(t)\rangle$ is a column vector with elements $(P_1, P_2, \ldots, P_M)$ and the matrix $A(t)$ contains
all transition rates. The matrix elements of $A(t)$ can be calculated by equating the
terms in Eq. (4.54) with those in Eq. (4.51). It is easy to show that


$$\frac{dP_m}{dt} = \sum_n A_{mn}P_n = \sum_{n \neq m} A_{mn}P_n + A_{mm}P_m = \sum_{n \neq m} \left(A_{mn}P_n - A_{nm}P_m\right), \tag{4.55}$$

where we used the relation $A_{mm}P_m = -\sum_{n \neq m} A_{nm}P_m$. This relation follows by noting
that probability conservation, Eq. (4.52), must hold in the special case $P_m = \delta_{pm}$, resulting in
$\sum_m A_{mn} = 0$ for all $n$. By comparing this equation with Eq. (4.51), we obtain the relation


$$A_{mn} = \Gamma_{m\leftarrow n} - \delta_{mn}\sum_{k \neq n}\Gamma_{k\leftarrow n}. \tag{4.56}$$

When evaluating this expression, it is important to remember that $\Gamma_{n\leftarrow n} = 0$ for all $n$.
Noting that the master equation always has a steady-state solution, it follows that the matrix 
A must have a zero eigenvalue whose eigenvector represents the steady-state solution of 
the master equation. 
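
As a small numerical illustration of the matrix form, the sketch below builds $A$ from a hypothetical three-state rate table (the numbers are arbitrary), checks that every column sums to zero, and extracts the steady state from the eigenvector whose eigenvalue is closest to zero. The helper name `rate_matrix` is introduced here for illustration.

```python
import numpy as np

def rate_matrix(Gamma):
    """A[m, n] = Gamma[m, n] - delta_mn * sum_k Gamma[k, n], cf. Eq. (4.56)."""
    return Gamma - np.diag(Gamma.sum(axis=0))

# Illustrative rates; Gamma[m, n] is the rate for the transition n -> m
Gamma = np.array([[0.0, 2.0, 0.5],
                  [1.0, 0.0, 0.3],
                  [0.2, 0.7, 0.0]])
A = rate_matrix(Gamma)

column_sums = A.sum(axis=0)             # vanish because probability is conserved
evals, evecs = np.linalg.eig(A)
k = np.argmin(np.abs(evals))            # index of the eigenvalue closest to zero
P_ss = np.real(evecs[:, k])
P_ss = P_ss / P_ss.sum()                # normalize to unit total probability
print(evals[k], P_ss)
```

The vanishing column sums mean that the row vector of ones is a left eigenvector with eigenvalue zero, which is one way to see why a zero eigenvalue must exist.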

It is common to resort to using Eq. (4.54) because one wants to focus on the quantum 
device without getting involved in the reservoir dynamics. The important question is how
we can find the transition rates between different quantum states of the device. In practice, 
master equations are set up using phenomenological arguments and physically justified 
approximations. An insightful example of this approach is provided by the Pauli master 
equation. It is often used for modeling quantum devices because of its simplicity. It has 
also found applications outside quantum mechanics in areas such as biology and chemical 
kinetics. 

The Pauli master equation considers all possible transitions between the energy states of 
a quantum device. As in Eq. (4.51), the occupation probability $P_i$ of the device being in
state $|i\rangle$ with energy $E_i$ is written as

$$\frac{dP_i(t)}{dt} = \sum_{j \neq i} \left[w_{ij}P_j(t) - w_{ji}P_i(t)\right], \tag{4.57}$$

where the transition rate $w_{ij}$ for moving from state $j$ to state $i$ is calculated using the Fermi Golden
Rule (see Section 2.2.6):


$$w_{ij} = \frac{2\pi}{\hbar}\,\eta^2 \left|\langle i|H_1|j\rangle\right|^2 \delta(E_i - E_j). \tag{4.58}$$

It assumes that the device Hamiltonian is of the form $H = H_0 + \eta H_1$, where a small
positive parameter $\eta$ denotes the strength of the perturbation and both $H_0$ and $H_1$ are time
independent. The energies $E_i$ and $E_j$ are the eigenvalues of the unperturbed Hamiltonian
$H_0$ (i.e., $H_0|i\rangle = E_i|i\rangle$).

The Pauli master equation (4.57) is like a rate equation and tells us how the occupation
probability of a specific state increases because of transitions that move the atom into that
state (gain) or decreases because of transitions that move the atom out of that state (loss).
It can be shown using the evolution operator, $U(t) = \exp\left[-\frac{i}{\hbar}(H_0 + \eta H_1)t\right]$, that the first-order
perturbation term applies only for a relatively short duration [269]. It is necessary to
include the second-order perturbation term ($\propto \eta^2$) for the device to establish equilibrium.
The time needed to reach equilibrium is proportional to $\eta^{-2}$, if individual interactions are
described within the Born approximation. This suggests that $\eta$ must be small and $t$ must be
large such that the product $\eta^2 t$ remains finite. Therefore, we must keep the $\eta^2 t$ term in the
perturbation expansion of the evolution operator but can neglect the terms containing $\eta^m t^n$
with $m \neq 2n$. As shown by van Hove [269], the resulting approximate expression for the
evolution operator has a time dependence consistent with the Pauli master equation [270].
The main point to remember is that the Fermi golden rule is valid as long as $\eta$ is small
but $t$ is large enough that $\eta^2 t$ is finite; this is sometimes referred to as the van Hove limit
[269, 270].
4.2.2 Reduced Density Matrix


The usefulness of Eq. (4.57) is limited in practice because it only deals with the occupation 
probabilities that are related to the diagonal elements of a density matrix. The off-diagonal 
elements cannot be ignored if the reservoir-induced decoherence plays a significant role. 
In this section we focus on the technique used to obtain a reduced density matrix from the 
total density matrix of the combined system. 

An important requirement for the master-equation approach is that the quantum device 
should couple to the reservoir weakly to ensure that the reservoir is not affected much 
by the device, and it is possible to employ the Born approximation. However, the device 
itself is profoundly affected by the reservoir and gets entangled with it. Owing to this 
entanglement, the device cannot be described using a “pure state” and requires a mixed- 
state description through a density operator (see Section 2.2.3). Also, owing to a relatively 
large size of the reservoir with a large number of degrees of freedom, the reservoir has 
closely spaced energy levels. In contrast, the device has a single or a few widely separated 
energy levels that interact with the reservoir. This allows one to make use of the Markovian 
approximation. This combination of the Born and Markovian approximations leads to a 
Markovian master equation that can often be solved analytically. Given the assumptions 
made in its derivation, such a master equation has the following properties: (1) it describes 
the dynamics of a quantum device on time scales larger than the correlation time of the 
reservoir; (2) its stationary solution corresponds to the state of thermal equilibrium with its 
reservoir; (3) it can be written in the Lindblad form (see Section 4.3). 

To describe the weak interaction between the device and its reservoir, we invoke standard 
quantum mechanics and discuss how to account for dissipation and decoherence without 
violating its postulates covered in Section 2.2. The approach employs a mathematical tech- 
nique to obtain a “reduced” density operator for the device alone starting from the density 
operator of the whole system (device plus reservoir). Even though the evolution of the total
density matrix is always unitary, the evolution of the reduced density matrix of the device can be nonunitary.

Let $\rho_D$, $\rho_R$, and $\rho_{DR}$ denote the density operators of the device, the reservoir, and the
composite system (device plus reservoir). As the composite system is a closed system, its
density operator $\rho_{DR}(t)$ evolves as

$$\rho_{DR}(t) = U_{DR}(t, 0)\,\rho_{DR}(0)\,U_{DR}^\dagger(t, 0), \tag{4.59}$$

where $U_{DR}(t, 0)$ is the evolution operator of the composite system (see Section 2.2.5 for
details). We can recover $\rho_D$ from $\rho_{DR}$ by using the operation $\rho_D = \mathrm{Tr}_R[\rho_{DR}]$, where $\mathrm{Tr}_R$
denotes a partial trace over the reservoir. Sometimes this procedure is called tracing out
the reservoir; see Aside 4.4 for details. Further progress is made by assuming that there
is no initial correlation between the device and the reservoir (i.e., $\rho_{DR}(0) = \rho_D(0) \otimes \rho_R(0)$;
see Section 2.2.1 for details on the tensor product denoted by $\otimes$). This is a reasonable
assumption when the device and reservoir are weakly coupled but may be inappropriate in
some specific cases [271]. If we also assume that the device has negligible influence on the
reservoir and use $\rho_R(t) \approx \rho_R(0)$, we arrive at the Born approximation and obtain

$$\rho_{DR}(t) \approx \rho_D(t) \otimes \rho_R(0). \tag{4.60}$$
These assumptions can be justified by noting that the reservoir represents a very large environment
relative to the device. Because the reservoir has many degrees of freedom and couples only weakly
to the device, a quantum device cannot influence the reservoir, even though it is
affected considerably by the reservoir [272, 273, 274].


Aside 4.4 Partial Trace and Reduced Density Operators 

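Consider two systems, D (a quantum device) and R (a reservoir), with the Hilbert spaces
$\mathcal{H}_D$ and $\mathcal{H}_R$, respectively. Given any two states, $|d\rangle \in \mathcal{H}_D$ and $|r\rangle \in \mathcal{H}_R$, in these systems,
their tensor product state $|d\rangle \otimes |r\rangle$ belongs to the composite system $\mathcal{H}_D \otimes \mathcal{H}_R$. We note that
the composite state $|d\rangle \otimes |r\rangle$ is sometimes compactly written as $|d\rangle|r\rangle$, $|d, r\rangle$, or $|dr\rangle$. When
an operator of the composite system $O_D \otimes O_R$, where $O_D$ acts on $\mathcal{H}_D$ and $O_R$ acts on $\mathcal{H}_R$,
acts on one of these composite states, we get

$$(O_D \otimes O_R)(|d\rangle \otimes |r\rangle) = (O_D|d\rangle) \otimes (O_R|r\rangle). \tag{4.61}$$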


Refer to Sections 2.2.1 and 2.2.3 for details on the tensor product and the density operator, 
respectively. 

Let $\rho_D$, $\rho_R$, and $\rho_{DR}$ denote the density operators of the device, the reservoir, and the
composite system. The concept of partial trace allows us to recover $\rho_D$ from $\rho_{DR}$ and we
denote it as

$$\rho_D = \mathrm{Tr}_R[\rho_{DR}], \tag{4.62}$$

where $\mathrm{Tr}_R[\,\cdot\,]$ is the partial trace taken in the state space of the reservoir. As a special case
of Eq. (4.61), we obtain the relation

$$\mathrm{Tr}_R\left[|d_1\rangle\langle d_2| \otimes |r_1\rangle\langle r_2|\right] = |d_1\rangle\langle d_2|\;\mathrm{Tr}_R\left[|r_1\rangle\langle r_2|\right]. \tag{4.63}$$

Here $|d_1\rangle\langle d_2| \otimes |r_1\rangle\langle r_2|$ is an operator in the composite state space, but $|d_1\rangle\langle d_2|$ is an
operator in the device space $\mathcal{H}_D$ alone. Also, $\mathrm{Tr}_R[|r_1\rangle\langle r_2|]$ is a complex number that can be
evaluated by noting that trace is invariant under cyclic permutation, giving $\mathrm{Tr}_R[|r_1\rangle\langle r_2|] =
\langle r_2|r_1\rangle$.

For easy reference, we list several key features of the partial trace. Like the standard density
operator, the reduced density operator $\rho_D$ satisfies the following five properties: (1)
Hermitian nature; (2) trace invariance; (3) detailed balance; (4) translational invariance;
and (5) positivity (only nonnegative eigenvalues, to comply with the probability interpretation).
Note that for any operator $\hat{O}$, the relation $E(\hat{O}) = \mathrm{Tr}[\hat{O}\rho_D(t)]$ does not depend
on the coordinates.
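
The partial trace is straightforward to evaluate numerically. The sketch below is an illustration only: the dimensions, the random kets, and the function name `partial_trace_reservoir` are assumptions introduced here. It checks Eq. (4.63) and confirms that a product state $\rho_D \otimes \rho_R$ reduces to $\rho_D$.

```python
import numpy as np

def partial_trace_reservoir(rho_DR, dim_D, dim_R):
    """Trace out the second (reservoir) factor of an operator on H_D (x) H_R."""
    rho = rho_DR.reshape(dim_D, dim_R, dim_D, dim_R)   # indices (d, r, d', r')
    return np.einsum('irjr->ij', rho)                  # sum over r = r'

rng = np.random.default_rng(1)
dim_D, dim_R = 2, 3

def random_ket(dim):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

d1, d2 = random_ket(dim_D), random_ket(dim_D)
r1, r2 = random_ket(dim_R), random_ket(dim_R)

# Left side of Eq. (4.63): Tr_R[ |d1><d2| (x) |r1><r2| ]
op = np.kron(np.outer(d1, d2.conj()), np.outer(r1, r2.conj()))
lhs = partial_trace_reservoir(op, dim_D, dim_R)
# Right side: |d1><d2| multiplied by <r2|r1>
rhs = np.outer(d1, d2.conj()) * np.vdot(r2, r1)

# A product state rho_D (x) rho_R reduces to rho_D
rho_D = np.outer(d1, d1.conj())
rho_R = 0.5 * (np.outer(r1, r1.conj()) + np.outer(r2, r2.conj()))
reduced = partial_trace_reservoir(np.kron(rho_D, rho_R), dim_D, dim_R)
print(np.allclose(lhs, rhs), np.allclose(reduced, rho_D))
```

The reshape relies on the index ordering produced by `np.kron`, in which the device index varies slowest.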


Suppose the reservoir’s density operator $\rho_R(0)$ corresponds to a mixture of states with
the probability $p_\alpha$ for being in the $\alpha$th state such that $\sum_\alpha p_\alpha = 1$. We do not assume these
states to be orthogonal, but each state is normalized such that $\langle\psi_{R\alpha}|\psi_{R\alpha}\rangle = 1$. The initial
density operator of the reservoir can be written as a sum over all possible states:

$$\rho_R(0) = \sum_\alpha p_\alpha\,|\psi_{R\alpha}\rangle\langle\psi_{R\alpha}|. \tag{4.64}$$
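We use this result to calculate the reduced density operator of the device as follows:

$$\begin{aligned}
\rho_D(t) &= \mathrm{Tr}_R[\rho_{DR}(t)] = \mathrm{Tr}_R\left[U_{DR}(t, 0)\,\rho_D(0) \otimes \rho_R(0)\,U_{DR}^\dagger(t, 0)\right] \\
&= \mathrm{Tr}_R\Big[U_{DR}(t, 0)\,\rho_D(0) \otimes \Big(\sum_\alpha p_\alpha\,|\psi_{R\alpha}\rangle\langle\psi_{R\alpha}|\Big)\,U_{DR}^\dagger(t, 0)\Big].
\end{aligned} \tag{4.65}$$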


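To make further progress, we make use of an orthonormal basis for the reservoir. Let
the set $\{|\phi_{R\beta}\rangle\}$ represent this basis and let $I_D$ and $I_R$ be the identity operators in the Hilbert
spaces of the device and the reservoir, respectively. Noting that the trace is the same
in any orthonormal basis, we write $\rho_D(t)$ using the preceding relation as

$$\begin{aligned}
\rho_D(t) &= \sum_\beta \left(I_D \otimes \langle\phi_{R\beta}|\right) U_{DR}(t, 0) \Big[\rho_D(0) \otimes \Big(\sum_\alpha p_\alpha\,|\psi_{R\alpha}\rangle\langle\psi_{R\alpha}|\Big)\Big] U_{DR}^\dagger(t, 0) \left(I_D \otimes |\phi_{R\beta}\rangle\right) \\
&= \sum_\beta \sum_\alpha W_{\alpha\beta}(t)\,[\rho_D(0) \otimes 1]\,W_{\alpha\beta}^\dagger(t),
\end{aligned} \tag{4.66}$$

where we collected various terms to introduce the operator $W_{\alpha\beta}(t)$ as [272, 273, 274]

$$W_{\alpha\beta}(t) = \left(I_D \otimes \langle\phi_{R\beta}|\right) U_{DR}(t, 0) \left(I_D \otimes \sqrt{p_\alpha}\,|\psi_{R\alpha}\rangle\right). \tag{4.67}$$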
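It is important to check whether $\mathrm{Tr}_D[\rho_D(t)] = 1$, as expected. It is easy to see that this
requirement will be satisfied if the double sum $S = \sum_\alpha \sum_\beta W_{\alpha\beta}^\dagger(t)W_{\alpha\beta}(t) = I_D$:

$$\begin{aligned}
\mathrm{Tr}_D[\rho_D(t)] &= \mathrm{Tr}_D\Big[\sum_\alpha \sum_\beta W_{\alpha\beta}(t)\,(\rho_D(0) \otimes 1)\,W_{\alpha\beta}^\dagger(t)\Big] \\
&= \mathrm{Tr}_D\Big[(\rho_D(0) \otimes 1) \sum_{\alpha,\beta} W_{\alpha\beta}^\dagger(t)W_{\alpha\beta}(t)\Big] \\
&= \mathrm{Tr}_D\left[(\rho_D(0) \otimes 1)(I_D \otimes 1)\right] = 1.
\end{aligned} \tag{4.68}$$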


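We verify the double sum as follows:

$$\begin{aligned}
S &= \sum_\alpha p_\alpha \left(I_D \otimes \langle\psi_{R\alpha}|\right) U_{DR}^\dagger(t, 0) \Big(I_D \otimes \sum_\beta |\phi_{R\beta}\rangle\langle\phi_{R\beta}|\Big) U_{DR}(t, 0) \left(I_D \otimes |\psi_{R\alpha}\rangle\right) \\
&= \sum_\alpha p_\alpha \left(I_D \otimes \langle\psi_{R\alpha}|\right) U_{DR}^\dagger(t, 0) \left(I_D \otimes I_R\right) U_{DR}(t, 0) \left(I_D \otimes |\psi_{R\alpha}\rangle\right) \\
&= \sum_\alpha p_\alpha \left(I_D \otimes \langle\psi_{R\alpha}|\right)\left(I_D \otimes |\psi_{R\alpha}\rangle\right) = \sum_\alpha p_\alpha \left(I_D \otimes \langle\psi_{R\alpha}|\psi_{R\alpha}\rangle\right) = I_D \otimes 1.
\end{aligned} \tag{4.69}$$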
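
The construction of the operators $W_{\alpha\beta}$ can be reproduced for a toy system. In the sketch below, a random unitary stands in for $U_{DR}(t, 0)$, and the dimensions, probabilities, and reservoir states are arbitrary choices made for illustration. It checks that $\sum_{\alpha,\beta} W_{\alpha\beta}^\dagger W_{\alpha\beta} = I_D$ and that the operator sum of Eq. (4.66) agrees with evolving the composite state and tracing out the reservoir.

```python
import numpy as np

rng = np.random.default_rng(2)
dD, dR = 2, 3

def random_unitary(n):
    """Unitary taken from the QR decomposition of a random complex matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

U = random_unitary(dD * dR)                  # stands in for U_DR(t, 0)
p = np.array([0.7, 0.3])                     # mixture probabilities p_alpha
psi = [np.array([1, 0, 0], dtype=complex),   # normalized, not mutually orthogonal
       np.array([1, 1, 0], dtype=complex) / np.sqrt(2)]
phi = np.eye(dR, dtype=complex)              # orthonormal reservoir basis (rows)
I_D = np.eye(dD)

def W(alpha, beta):
    """Eq. (4.67): (I_D x <phi_beta|) U (I_D x sqrt(p_alpha) |psi_alpha>)."""
    bra = np.kron(I_D, phi[beta].conj().reshape(1, dR))                 # dD x (dD*dR)
    ket = np.kron(I_D, np.sqrt(p[alpha]) * psi[alpha].reshape(dR, 1))   # (dD*dR) x dD
    return bra @ U @ ket

kraus = [W(al, be) for al in range(len(p)) for be in range(dR)]
S = sum(Wk.conj().T @ Wk for Wk in kraus)                       # Eq. (4.69)

rho_D0 = np.array([[0.6, 0.2j], [-0.2j, 0.4]])
rho_R0 = sum(pa * np.outer(v, v.conj()) for pa, v in zip(p, psi))
rho_D_kraus = sum(Wk @ rho_D0 @ Wk.conj().T for Wk in kraus)    # Eq. (4.66)

# Direct route: evolve the composite state, then trace out the reservoir
rho_DR = U @ np.kron(rho_D0, rho_R0) @ U.conj().T
rho_D_direct = np.einsum('irjr->ij', rho_DR.reshape(dD, dR, dD, dR))
print(np.allclose(S, I_D), np.allclose(rho_D_kraus, rho_D_direct))
```

Because each $W_{\alpha\beta}$ already acts on the device space alone, the factor "$\otimes\,1$" in Eq. (4.66) needs no counterpart in the code.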


4.2.3 Markovian Master Equation 


The preceding section has shown that it is possible to define an operator mapping $V(t)$ such
that

$$\rho_D(t) = V(t)\rho_D(0) = \sum_{\alpha,\beta} W_{\alpha\beta}(t)\,[\rho_D(0) \otimes 1]\,W_{\alpha\beta}^\dagger(t). \tag{4.70}$$
Even though we treated time $t$ as a fixed parameter up to now, if we allow it to vary keeping
$t > 0$, we obtain a one-parameter family of dynamical maps that are completely positive
and trace preserving. We now show that this family belongs to a semigroup, which
will allow us to use known results in this area. The semigroup condition requires that
$V(t_1)V(t_2) = V(t_1 + t_2)$ holds for $V(t)$; this is also known as the Markovian property. To
prove it, we use Eq. (4.67) and the result $U_{DR}(t_1 + t_2, 0) = U_{DR}(t_1, 0)U_{DR}(t_2, 0)$:

$$\begin{aligned}
V(t_2)[V(t_1)\rho_D(0)] &= \sum_{\alpha,\beta} W_{\alpha\beta}(t_2) \Big[\Big(\sum_{\alpha',\beta'} W_{\alpha'\beta'}(t_1)\,[\rho_D(0) \otimes 1]\,W_{\alpha'\beta'}^\dagger(t_1)\Big) \otimes 1\Big] W_{\alpha\beta}^\dagger(t_2) \\
&= \sum_{\alpha,\beta} W_{\alpha\beta}(t_1 + t_2)\,[\rho_D(0) \otimes 1]\,W_{\alpha\beta}^\dagger(t_1 + t_2) = V(t_1 + t_2)\rho_D(0).
\end{aligned} \tag{4.71}$$
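
A concrete dynamical map makes the semigroup condition easy to test. The sketch below uses the standard amplitude-damping map of a two-level system (spontaneous decay at a rate `gamma`); this example is not taken from the text and is chosen here only because its operator-sum form is known in closed form. It checks that applying the map for $t_1$ and then $t_2$ matches a single application for $t_1 + t_2$.

```python
import numpy as np

def amplitude_damping(rho, t, gamma=1.0):
    """Apply V(t) for decay at rate gamma; basis order (g, e)."""
    p = 1.0 - np.exp(-gamma * t)                      # decay probability after time t
    K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - p)]])
    K1 = np.array([[0.0, np.sqrt(p)], [0.0, 0.0]])    # sqrt(p) |g><e|
    return K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

rho0 = np.array([[0.2, 0.3 - 0.1j], [0.3 + 0.1j, 0.8]])   # illustrative initial state
t1, t2 = 0.4, 1.1
two_steps = amplitude_damping(amplitude_damping(rho0, t1), t2)
one_step = amplitude_damping(rho0, t1 + t2)
print(np.allclose(two_steps, one_step))
```

The map has no inverse within the family, since undoing it would require a negative time and hence a decay probability outside the interval from 0 to 1.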


It is important to note that the constraint of time being positive (t1, t2 > 0) implies that 
we can only propagate the system forward in time (i.e., this dynamical map does not have 
an inverse). This behavior is quite different from the coherent evolution of closed systems, 
where an inverse operation with a negative time argument always exists. Mathematically, 
even though the dynamical map of a closed quantum system forms a group, the corresponding
map for an open quantum system can only form a semigroup. The generator of
the semigroup is called the Liouvillian $\mathcal{L}$ defined through $V(t) = \exp(\mathcal{L}t)$. The Liouvillian
operator is a generalization of the concept of a superoperator. One important consequence
of this generalization is that the von Neumann entropy [see Eq. (2.50)] defined for our
open system is not a conserved quantity. As the von Neumann entropy is essentially a
quantum analog of the well-known Shannon entropy used in communication theory, this
result suggests that the system loses coherence (or information) as it evolves. We can find
the Liouville equation using a small time step ($\Delta t$) such that


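$$\rho_D(t + \Delta t) = V(\Delta t)\rho_D(t) = \exp(\mathcal{L}\Delta t)\rho_D(t) = (1 + \mathcal{L}\Delta t)\rho_D(t) + O(\Delta t^2). \tag{4.72}$$

If we take the limit $\Delta t \to 0$, we arrive at the Liouville equation

$$\frac{d\rho_D(t)}{dt} = \mathcal{L}\rho_D(t). \tag{4.73}$$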


The final task is to find an explicit form of the Liouvillian $\mathcal{L}$ in terms of the device
and reservoir parameters. For this purpose, we need to find a basis to represent $\mathcal{L}$ in the
Liouville space, which has a dimension of $N^2$ in contrast with the dimension $N$ of the
device’s Hilbert space. This is done through a set of $N^2$ operators denoted by $\{F_i\}$ with
$i$ varying from 1 to $N^2$ such that $F_1 = I_D$ [272]. The inner product for this basis is
defined as

$$F_i \cdot F_j = \frac{1}{N}\mathrm{Tr}_D\left[F_i^\dagger F_j\right], \tag{4.74}$$

so that the orthonormality condition, $F_i \cdot F_j = \delta_{ij}$, holds for any pair of basis elements.
Using $F_1 = I_D$, we can show that all other basis elements are traceless:

$$F_1 \cdot F_j = \frac{1}{N}\mathrm{Tr}_D\left[I_D^\dagger F_j\right] = \frac{1}{N}\mathrm{Tr}_D[F_j] = 0 \quad (j \neq 1). \tag{4.75}$$
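We use the orthonormal basis $\{F_i\}$ to expand the operator $W_{\alpha\beta}(t)$ as

$$W_{\alpha\beta}(t) = \sum_{i=1}^{N^2} [W_{\alpha\beta}(t) \cdot F_i]\,F_i. \tag{4.76}$$

Substitution of this result in Eq. (4.70) gives us

$$\begin{aligned}
V(t)\rho_D(0) &= \sum_\alpha \sum_\beta \Big[\sum_{i=1}^{N^2} (W_{\alpha\beta}(t) \cdot F_i)\,F_i\Big] [\rho_D(0) \otimes 1] \Big[\sum_{j=1}^{N^2} (W_{\alpha\beta}(t) \cdot F_j)\,F_j\Big]^\dagger \\
&= \sum_{i=1}^{N^2} \sum_{j=1}^{N^2} \sum_{\alpha,\beta} [W_{\alpha\beta}(t) \cdot F_i][W_{\alpha\beta}(t) \cdot F_j]^*\,F_i\,[\rho_D(0) \otimes 1]\,F_j^\dagger \\
&= \sum_{i=1}^{N^2} \sum_{j=1}^{N^2} c_{ij}(t)\,F_i\,[\rho_D(0) \otimes 1]\,F_j^\dagger,
\end{aligned} \tag{4.77}$$

where we defined the time-dependent coefficients $c_{ij}(t)$ as

$$c_{ij}(t) = \sum_\alpha \sum_\beta [W_{\alpha\beta}(t) \cdot F_i][W_{\alpha\beta}(t) \cdot F_j]^*. \tag{4.78}$$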


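It follows that the matrix $c$ formed using these coefficients is Hermitian ($c_{ij} = c_{ji}^*$). The
matrix is also positive because, for any complex vector $v$, we have the relation [272]

$$\sum_{i=1}^{N^2} \sum_{j=1}^{N^2} c_{ij}\,v_i^* v_j = \sum_{\alpha,\beta} \Big|\sum_{i=1}^{N^2} v_i^*\,[W_{\alpha\beta}(t) \cdot F_i]\Big|^2 \ge 0. \tag{4.79}$$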


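Thus, all eigenvalues of the matrix $c$ are real and positive. We use this property later.
As $V(t)\rho_D(0) = \exp(\mathcal{L}t)\rho_D(0)$, we can find $\mathcal{L}\rho_D(0)$ by evaluating $\frac{d}{dt}[V(t)\rho_D(0)]$ at $t = 0$.
Differentiating Eq. (4.77) with respect to $t$ and setting $t = 0$, we obtain

$$\mathcal{L}\rho_D(0) = \left.\frac{d}{dt}[V(t)\rho_D(0)]\right|_{t=0} = \sum_{i=1}^{N^2} \sum_{j=1}^{N^2} a_{ij}\,F_i\,[\rho_D(0) \otimes 1]\,F_j^\dagger, \tag{4.80}$$

where we introduced the time-independent coefficients $a_{ij}$ as

$$a_{ij} = \left.\frac{dc_{ij}(t)}{dt}\right|_{t=0}. \tag{4.81}$$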


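We can simplify further by using $F_1 = I_D$. Separating the $i = j = 1$ term from the double
sum in the preceding expression, we obtain

$$\begin{aligned}
\mathcal{L}\rho_D(0) &= a_{11}\,I_D\,[\rho_D(0) \otimes 1]\,I_D^\dagger + \sum_{i=2}^{N^2} a_{i1}\,F_i\,[\rho_D(0) \otimes 1]\,I_D^\dagger \\
&\quad + \sum_{j=2}^{N^2} a_{1j}\,I_D\,[\rho_D(0) \otimes 1]\,F_j^\dagger + \sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}\,F_i\,[\rho_D(0) \otimes 1]\,F_j^\dagger.
\end{aligned} \tag{4.82}$$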


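We simplify this expression by introducing a new operator $F$ as

$$F = \sum_{i=2}^{N^2} a_{i1}F_i, \qquad F^\dagger = \sum_{i=2}^{N^2} a_{1i}F_i^\dagger, \tag{4.83}$$

where we used the fact that the matrix $a = \{a_{ij}\}$ is Hermitian because $c = \{c_{ij}\}$ is Hermitian.
In terms of $F$, we can write $\mathcal{L}\rho_D(0)$ as

$$\mathcal{L}\rho_D(0) = a_{11}[\rho_D(0) \otimes 1] + F[\rho_D(0) \otimes 1] + [\rho_D(0) \otimes 1]F^\dagger + \sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}\,F_i\,[\rho_D(0) \otimes 1]\,F_j^\dagger. \tag{4.84}$$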


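Noting that the trace of the reduced density operator does not change as it evolves in
time, the relation $\mathrm{Tr}_D[\mathcal{L}\rho_D(0)] = 0$ is satisfied for any $\rho_D(0)$. Using it, we obtain

$$\mathrm{Tr}_D\Big[\Big(a_{11}I_D + F + F^\dagger + \sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}F_j^\dagger F_i\Big)\rho_D(0)\Big] = 0. \tag{4.85}$$

It follows from this equation that

$$a_{11}I_D + F + F^\dagger + \sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}F_j^\dagger F_i = 0. \tag{4.86}$$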


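Using this result, we finally obtain the quantum master equation in a form referred to as
the “first standard form” [272]:

$$\frac{d\rho_D(t)}{dt} = -\frac{i}{\hbar}[H_F, \rho_D(t)] + \mathcal{D}(\rho_D(t)), \tag{4.87}$$

where $H_F = \frac{i\hbar}{2}(F - F^\dagger)$ and the functional $\mathcal{D}(X)$ is defined as

$$\mathcal{D}(X) = \sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}\Big(F_i X F_j^\dagger - \frac{1}{2}F_j^\dagger F_i X - \frac{1}{2}X F_j^\dagger F_i\Big) = \frac{1}{2}\sum_{i=2}^{N^2} \sum_{j=2}^{N^2} a_{ij}\left([F_i X, F_j^\dagger] + [F_i, X F_j^\dagger]\right). \tag{4.88}$$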


When $\mathcal{D}(\rho_D(t)) = 0$, Eq. (4.87) reduces to the standard Liouville equation for a closed
system. Thus, we can immediately identify $H_F$ as the Hamiltonian that enables unitary evolution
of a quantum device when dissipation and decoherence are absent; $H_F$ is referred
to as the effective Hamiltonian of the device. One effect of the reservoir is to change the
original device Hamiltonian $H_D$ to $H_F$, which results in shifting of energy levels of the
device (the Lamb shift). The second effect of the reservoir comes through the operator
$\mathcal{D}(\rho_D(t))$, which introduces decoherence and dissipation because of the device’s interaction
with the reservoir. This operator is sometimes referred to as the dissipator in
the literature. We stress that Eq. (4.87) is local in time because it depends on $\rho_D$ at time $t$,
but not on $\rho_D$ values at earlier times (the Markovian property). Even though the derivation
of this equation was presented systematically in 1976 in Refs. [275, 276], similar
equations appeared earlier in the context of spin relaxation dynamics and laser theory 
[277, 278, 279]. 
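
The step from Eqs. (4.84) and (4.86) to Eq. (4.87) is compressed in the derivation above. The following sketch fills it in under two shorthands introduced here only for brevity: $\rho$ stands for $\rho_D(0) \otimes 1$ and $G$ for the double sum appearing in Eq. (4.86).

```latex
\begin{aligned}
a_{11} I_D &= -\left(F + F^\dagger + G\right), \qquad
  G = \sum_{i=2}^{N^2}\sum_{j=2}^{N^2} a_{ij} F_j^\dagger F_i, \\
a_{11}\rho &= \tfrac{1}{2}\left(a_{11} I_D\,\rho + \rho\,a_{11} I_D\right)
  = -\tfrac{1}{2}\left(F\rho + F^\dagger\rho + \rho F + \rho F^\dagger\right)
    - \tfrac{1}{2}\left(G\rho + \rho G\right), \\
\mathcal{L}\rho &= a_{11}\rho + F\rho + \rho F^\dagger
  + \sum_{i=2}^{N^2}\sum_{j=2}^{N^2} a_{ij} F_i \rho F_j^\dagger \\
 &= \tfrac{1}{2}\left[F - F^\dagger, \rho\right]
  + \sum_{i=2}^{N^2}\sum_{j=2}^{N^2} a_{ij}\left(F_i \rho F_j^\dagger
    - \tfrac{1}{2} F_j^\dagger F_i \rho - \tfrac{1}{2} \rho F_j^\dagger F_i\right).
\end{aligned}
```

Matching the commutator in the last line with $-\frac{i}{\hbar}[H_F, \rho]$ gives $H_F = \frac{i\hbar}{2}(F - F^\dagger)$, which is Hermitian by construction, and the remaining double sum is the dissipator.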


4.3 Lindblad Equation 


The Lindblad equation [276] is a variant of the Markovian quantum master equation 
derived in Section 4.2. Even though it bears Lindblad’s name, equivalent results were 
obtained by Gorini et al. [275, 280]. The Lindblad equation makes use of the concept 
of positive mapping that was first introduced by Stinespring [281] and later studied by 
Kraus [282]. 


4.3.1 Derivation of the Lindblad Equation 


The use of the orthonormal basis $\{F_i\}$ in Section 4.2.3 led to the coefficients $c_{ij}$ in Eq.
(4.78) that were used to define $a_{ij}$ in Eq. (4.81). However, even though the $N^2 \times N^2$ matrix
of $a_{ij}$ coefficients is Hermitian, it is not necessarily diagonal in the $\{F_i\}$ basis. Clearly, the
dissipator will have a less complicated structure in a basis in which the coefficients $a_{ij}$
form a diagonal matrix. To realize this simplification, we diagonalize the $a$ matrix using
a unitary matrix $U$ such that $a = UDU^\dagger$, where $D$ is a diagonal matrix with the elements
$(\gamma_{a1}, \gamma_{a2}, \ldots, \gamma_{aN^2})$. These eigenvalues are real and positive because the matrix $a$ is
Hermitian and positive definite. We use the matrix $U$ to introduce a new set of operators
$A_k$ as

$$F_i = \sum_{k=2}^{N^2} (U^\dagger)_{ki}\,A_k, \tag{4.89}$$

where $k = 2, 3, \ldots, N^2$; $F_1$ is not expanded because $F_1 = I_D$. Writing the matrix relation
$a = UDU^\dagger$ in its explicit form, we have

$$a_{ij} = \sum_{p=1}^{N^2} \sum_{q=1}^{N^2} U_{ip}D_{pq}U_{jq}^* = \sum_p \gamma_{ap}\,U_{ip}U_{jp}^*. \tag{4.90}$$
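We substitute $F_i$ and $a_{ij}$ in Eq. (4.87) to simplify the dissipator $\mathcal{D}(\rho_D(t))$:

$$\begin{aligned}
\mathcal{D}(\rho_D(t)) &= \frac{1}{2}\sum_{i=2}^{N^2} \sum_{p=2}^{N^2} \sum_{j=2}^{N^2} \gamma_{ap}\,U_{ip}U_{jp}^* \Big[\sum_k (U^\dagger)_{ki}A_k\,\rho_D(t), \sum_l A_l^\dagger U_{jl}\Big] \\
&\quad + \frac{1}{2}\sum_{i=2}^{N^2} \sum_{p=2}^{N^2} \sum_{j=2}^{N^2} \gamma_{ap}\,U_{ip}U_{jp}^* \Big[\sum_k (U^\dagger)_{ki}A_k, \rho_D(t)\sum_l A_l^\dagger U_{jl}\Big].
\end{aligned} \tag{4.91}$$

This equation can be simplified considerably because $U$ is a unitary matrix. The final result
is given by

$$\mathcal{D}(\rho_D(t)) = \frac{1}{2}\sum_{k=2}^{N^2} \gamma_{ak}\left([A_k\rho_D(t), A_k^\dagger] + [A_k, \rho_D(t)A_k^\dagger]\right). \tag{4.92}$$

When we substitute this result in Eq. (4.87), we obtain the Lindblad equation [276]

$$\frac{d\rho_D(t)}{dt} = -\frac{i}{\hbar}[H_F, \rho_D(t)] + \frac{1}{2}\sum_{k=2}^{N^2} \gamma_{ak}\left([A_k\rho_D(t), A_k^\dagger] + [A_k, \rho_D(t)A_k^\dagger]\right). \tag{4.93}$$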


Even though more general evolution equations are conceivable using the Kraus map 
[283], the Lindblad equation is the most widely used quantum master equation. It has 
the following properties: (1) It is local in time (Markovian property), (2) has constant 
coefficients, and (3) preserves the Hermitian and positivity properties of the density matrix. 
The $A_k$ operators are referred to as the Lindblad operators, or as jump operators of the
system. It is important to note that all coefficients $\gamma_{ak}$ represent rates (unit s$^{-1}$) because $A_k$
operators are dimensionless. In practice, these rates correspond to the relaxation rates of
the device. 
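
The right side of the Lindblad equation is simple to evaluate for matrices of modest size. The sketch below is illustrative only: the function name `lindblad_rhs`, the random Hamiltonian, the random jump operators, and the rates are all assumptions made here. It confirms two of the listed properties at the level of the generator, namely that the time derivative of the density matrix is traceless and Hermitian.

```python
import numpy as np

def lindblad_rhs(rho, H, jump_ops, rates, hbar=1.0):
    """Right side of Eq. (4.93) for a density matrix rho."""
    def comm(X, Y):
        return X @ Y - Y @ X
    out = (-1j / hbar) * comm(H, rho)
    for A, gamma in zip(jump_ops, rates):
        Ad = A.conj().T
        out = out + 0.5 * gamma * (comm(A @ rho, Ad) + comm(A, rho @ Ad))
    return out

rng = np.random.default_rng(3)
N = 3

def random_matrix():
    return rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

X = random_matrix()
H = X + X.conj().T                        # Hermitian Hamiltonian
Y = random_matrix()
rho = Y @ Y.conj().T
rho = rho / np.trace(rho).real            # positive, unit-trace density matrix
jump_ops = [random_matrix(), random_matrix()]
rates = [0.4, 1.3]                        # nonnegative rates (illustrative)

drho = lindblad_rhs(rho, H, jump_ops, rates)
print(abs(np.trace(drho)), np.allclose(drho, drho.conj().T))
```

Both checks hold for any choice of jump operators, because each bracket in Eq. (4.93) is built from commutators and is Hermitian when `rho` is.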

Apart from quantum optics and quantum computing, the Lindblad equation has found 
applications in other branches of physics and chemistry. For example, it has been used 
to model the effect of environment on an ultrafast predissociation process [284]. It has
also been used to model the dynamics of laser-induced nonthermal desorption of neutral 
molecules from metal surfaces [285]. In nuclear physics, a semigroup formalism similar 
to that used for the Lindblad equation is applied to model giant resonances in the nuclear 
spectra above the neutron-emission threshold [286]. 

It is also useful to consider how the Lindblad formalism relates to the Pauli master equa- 
tion in Eq. (4.57). For this purpose, we first note that the diagonal elements of the density 
matrix correspond to populations of the associated energy levels. Second, we choose a 
basis {|m)} of the energy eigenstates by diagonalizing the Hamiltonian of the device. The 
Lindblad equation for the diagonal elements in this basis takes the form 


$$\frac{d}{dt}(\rho_D)_{mm}(t) = \sum_j \gamma_{aj}\left|(A_j)_{mn_j}\right|^2 \left[(\rho_D)_{n_j n_j}(t) - (\rho_D)_{mm}(t)\right], \tag{4.94}$$

where we assumed that the Lindblad operator $A_j$ couples the state $|m\rangle$ only to the state $|n_j\rangle$.
We now make the assignment $(\rho_D)_{mm}(t) \to P_m(t)$. The resulting equation can be written
in the form of the Pauli master equation as

$$\frac{d}{dt}P_m(t) = \sum_{n \neq m} \left[w_{mn}P_n(t) - w_{nm}P_m(t)\right], \tag{4.95}$$

where $w_{mn} = \sum_j \gamma_{aj}\left|(A_j)_{mn_j}\right|^2 \delta_{nn_j}$. The important point to note is that this equation is based
solely on the diagonal elements of the density matrix. It may be valid in some limiting 
cases, but off-diagonal matrix elements of the density matrix also need to be considered in 
the quantum-coherence regime. Even though one may circumvent these elements by adopt- 
ing a formulation of coherent dynamics based on the diagonal elements, such an approach 
is nonintuitive and computationally hard because one needs to deal with the nonlocal time 
derivatives [273]. 
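
The link between the Lindblad equation and a rate equation for populations can be seen in the smallest possible case. The sketch below assumes a two-level device with jump operators $\sigma_-$ and $\sigma_+$ and illustrative rates `g_down` and `g_up`; these choices are made here for illustration and are not taken from the text. The diagonal elements of the Lindblad right side then reproduce gain and loss terms of the Pauli form, even when the density matrix carries coherences.

```python
import numpy as np

def lindblad_rhs(rho, H, jump_ops, rates, hbar=1.0):
    """Right side of Eq. (4.93) for a density matrix rho."""
    def comm(X, Y):
        return X @ Y - Y @ X
    out = (-1j / hbar) * comm(H, rho)
    for A, gamma in zip(jump_ops, rates):
        Ad = A.conj().T
        out = out + 0.5 * gamma * (comm(A @ rho, Ad) + comm(A, rho @ Ad))
    return out

# Basis order (e, g)
sm = np.array([[0, 0], [1, 0]], dtype=complex)    # sigma_- = |g><e|
sp = sm.conj().T                                  # sigma_+ = |e><g|
H = np.diag([0.5, -0.5]).astype(complex)          # diagonal in the energy basis
g_down, g_up = 1.0, 0.25                          # illustrative rates
rho = np.array([[0.3, 0.1 + 0.2j], [0.1 - 0.2j, 0.7]])

drho = lindblad_rhs(rho, H, [sm, sp], [g_down, g_up])

# Rate-equation prediction for the populations
P_e, P_g = rho[0, 0].real, rho[1, 1].real
dPe_rate = g_up * P_g - g_down * P_e
dPg_rate = g_down * P_e - g_up * P_g
print(drho[0, 0].real, dPe_rate)
```

The off-diagonal element of `rho` does not enter the population derivatives here, but it decays at its own rate and would matter for any observable sensitive to coherence.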

Because the Lindblad equation is a linear equation, it can be written in the form
$d\rho_D(t)/dt = \mathcal{L}\rho_D(t)$. However, this format does not allow one to make use of readily available
computational techniques because the density operator $\rho_D(t)$ is in the form of a matrix.
Most computational schemes are designed for equations of the form $dV/dt = MV$, where
$V$ is a vector and $M$ is a matrix. In practice, it is required to map the density operator
$\rho_D(t)$ to a column vector using a strategy known as vec-ing [287, 288]. In some cases, the
Lindblad equation can be solved analytically. The simplest example of such a dissipative 
quantum system is a damped harmonic oscillator. We discuss it next because this example 
provides insight that is lost in describing complex systems. 
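
The vec-ing step can be illustrated with a minimal pure-Python sketch for a two-level device. It assumes $\hbar = 1$, column stacking for the vectorization, and a single Lindblad operator $\sigma_-$ with an arbitrarily chosen rate; the helper names (`kron`, `vec`, `liouvillian`, `lindblad_rhs`) are illustrative and do not come from the text. The sketch builds the matrix $M$ acting on the vectorized density matrix and compares it with a direct evaluation of the right side of the Lindblad equation.

```python
# Vec-ing sketch for a two-level device (hbar = 1). All names are illustrative.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(*mats):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def kron(A, B):
    # Kronecker product A (x) B as a nested list.
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def vec(X):
    # Column stacking: vec(A X B) = (B^T (x) A) vec(X).
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lindblad_rhs(H, jumps, rho):
    # -i[H, rho] + sum_j gamma_j (A rho A^+ - {A^+ A, rho}/2), evaluated directly.
    out = scale(-1j, add(matmul(H, rho), scale(-1.0, matmul(rho, H))))
    for gamma, A in jumps:
        A_dag = dagger(A)
        A_dag_A = matmul(A_dag, A)
        out = add(out,
                  scale(gamma, matmul(matmul(A, rho), A_dag)),
                  scale(-0.5 * gamma, matmul(A_dag_A, rho)),
                  scale(-0.5 * gamma, matmul(rho, A_dag_A)))
    return out

def liouvillian(H, jumps):
    # Matrix M such that d vec(rho)/dt = M vec(rho).
    eye = identity(len(H))
    M = add(scale(-1j, kron(eye, H)), scale(1j, kron(transpose(H), eye)))
    for gamma, A in jumps:
        A_dag = dagger(A)
        A_dag_A = matmul(A_dag, A)
        A_conj = transpose(A_dag)
        M = add(M,
                scale(gamma, kron(A_conj, A)),
                scale(-0.5 * gamma, kron(eye, A_dag_A)),
                scale(-0.5 * gamma, kron(transpose(A_dag_A), eye)))
    return M

# Assumed example: basis order (|e>, |g>), H = sigma_3 / 2, decay through sigma_-.
sigma3 = [[1.0, 0.0], [0.0, -1.0]]
sigma_minus = [[0.0, 0.0], [1.0, 0.0]]
H = scale(0.5, sigma3)
jumps = [(0.2, sigma_minus)]
rho = [[0.3, 0.1 + 0.2j], [0.1 - 0.2j, 0.7]]

M = liouvillian(H, jumps)
via_vec = matvec(M, vec(rho))
direct = vec(lindblad_rhs(H, jumps, rho))
```

Once `M` is available, any standard routine for $dV/dt = MV$ can be applied to the vectorized density matrix; the price is that an $N \times N$ density matrix leads to an $N^2 \times N^2$ matrix $M$.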


4.3.2 A Damped Harmonic Oscillator 


Consider a harmonic oscillator oscillating at frequency $\omega_0$. The Hamiltonian of an
undamped harmonic oscillator has the form

$$H_D = \hbar\omega_0\left(\hat{a}^\dagger\hat{a} + \tfrac{1}{2}\right), \tag{4.96}$$

where $\hat{a}$ and $\hat{a}^\dagger$ are the annihilation and creation operators that satisfy the commutation
relation $[\hat{a}, \hat{a}^\dagger] = 1$. To account for dissipation induced by the oscillator's coupling to
a reservoir, we invoke the Lindblad equation given in Eq. (4.93). Given that the Hilbert
space is two-dimensional ($N = 2$), the Liouville space has dimensions $N^2 = 4$. Here we
consider only three Lindblad operators and identify them as $\{A_1 = I_D, A_2 = \hat{a}, A_3 = \hat{a}^\dagger\}$,
where $I_D$ is the identity element. Introducing two positive eigenvalues $\gamma_{a2}$ and $\gamma_{a3}$, we can
write the Lindblad equation in the form


$$\frac{d\rho_D}{dt} = -i\omega_0[\hat{a}^\dagger\hat{a}, \rho_D] + \frac{1}{2}\gamma_{a2}\left([\hat{a}\rho_D, \hat{a}^\dagger] + [\hat{a}, \rho_D\hat{a}^\dagger]\right) + \frac{1}{2}\gamma_{a3}\left([\hat{a}^\dagger\rho_D, \hat{a}] + [\hat{a}^\dagger, \rho_D\hat{a}]\right). \tag{4.97}$$

It follows from the cyclic-invariance property of the trace operator that $\mathrm{Tr}[\rho_D(t)]$ maintains
its initial value, as it should, because the trace of each commutator on the right side of the
preceding equation is zero.

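To find the equation of motion for averaged quantities, we use the identity $\mathrm{Tr}[A[B, C]] = \mathrm{Tr}[[A, B]C]$,
which can be proven using the cyclic-invariance property of the trace. Using $\langle\hat{a}(t)\rangle = \mathrm{Tr}[\rho_D(t)\hat{a}]$, we obtain

$$\frac{d}{dt}\langle\hat{a}\rangle = -i\omega_0\langle\hat{a}\rangle - \frac{1}{2}(\gamma_{a2} - \gamma_{a3})\langle\hat{a}\rangle. \tag{4.98}$$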


We can integrate this equation to find the following analytical solution:

$$\langle\hat{a}(t)\rangle = \exp\left[-i\omega_0 t - \frac{1}{2}(\gamma_{a2} - \gamma_{a3})t\right]\langle\hat{a}(0)\rangle. \tag{4.99}$$

This result shows that, as long as $\gamma_{a2} > \gamma_{a3}$, the amplitude of a quantum harmonic oscillator
is damped in the same way as expected for a classical oscillator. It is important to
emphasize that this decay occurs for the expectation value of the $\hat{a}$ operator, which itself
cannot decay because that would violate the commutation relation $[\hat{a}, \hat{a}^\dagger] = 1$. If $\gamma_{a3} > \gamma_{a2}$, the
preceding equation describes a quantum amplifier. However, saturation effects must be
included because exponential growth cannot continue forever.

Another quantity of interest is $\langle N_D\rangle = \mathrm{Tr}[\hat{a}^\dagger\hat{a}\rho_D(t)]$, representing the average population of
the harmonic oscillator. Using the same procedure, we obtain the following rate equation:

$$\frac{d}{dt}\langle N_D\rangle = -\gamma_{a2}\langle N_D\rangle + \gamma_{a3}\left(\langle N_D\rangle + 1\right). \tag{4.100}$$

It follows from this equation that, when $\gamma_{a2} > \gamma_{a3}$ and the harmonic oscillator is damped,
the population reaches a steady-state value of

$$\langle N_D\rangle_{ss} = \frac{\gamma_{a3}}{\gamma_{a2} - \gamma_{a3}}. \tag{4.101}$$


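If the harmonic oscillator is in thermal equilibrium, its average energy, $E_{av} = \langle N_D\rangle_{ss}\hbar\omega_0$,
can be calculated from the Boltzmann distribution at a specific temperature $T$. Using

$$E_{av} = \langle N_D\rangle_{ss}\hbar\omega_0 = \frac{\hbar\omega_0}{\exp\left(\frac{\hbar\omega_0}{k_BT}\right) - 1}, \tag{4.102}$$

we obtain the relation

$$\frac{\gamma_{a2}}{\gamma_{a3}} = \exp\left(\frac{\hbar\omega_0}{k_BT}\right) = \frac{\langle N_D\rangle_{ss} + 1}{\langle N_D\rangle_{ss}}. \tag{4.103}$$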
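
Equations (4.100) to (4.103) can be checked numerically with a short sketch. The oscillator frequency (1 THz), the temperature (300 K), and the overall rate scale `gamma0` are assumed values chosen only for illustration, and the function names are hypothetical. The rates are taken proportional to $\langle N_D\rangle_{ss} + 1$ and $\langle N_D\rangle_{ss}$, which is one choice that satisfies Eq. (4.103).

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
KB = 1.380649e-23       # Boltzmann constant (J/K)

def thermal_occupation(omega, temperature):
    """Planck occupation number 1 / (exp(hbar*omega/kB*T) - 1), as in Eq. (4.102)."""
    return 1.0 / math.expm1(HBAR * omega / (KB * temperature))

def population(t, n0, gamma_a2, gamma_a3):
    """Closed-form solution of the rate equation (4.100) for <N_D>(t)."""
    n_ss = gamma_a3 / (gamma_a2 - gamma_a3)          # Eq. (4.101)
    return n_ss + (n0 - n_ss) * math.exp(-(gamma_a2 - gamma_a3) * t)

# Assumed illustrative parameters (not from the text).
omega0 = 2.0 * math.pi * 1.0e12   # rad/s
temperature = 300.0               # K
gamma0 = 1.0e9                    # 1/s, overall rate scale

n_th = thermal_occupation(omega0, temperature)
gamma_a2 = gamma0 * (n_th + 1.0)  # loss rate
gamma_a3 = gamma0 * n_th          # gain rate
n_ss = gamma_a3 / (gamma_a2 - gamma_a3)
n_late = population(50.0 / gamma0, 0.0, gamma_a2, gamma_a3)
```

With these rates the net damping rate $\gamma_{a2} - \gamma_{a3}$ equals `gamma0`, so an initially empty oscillator relaxes to the thermal occupation within a few multiples of `1 / gamma0`.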


Here $\langle N_D\rangle_{ss}$ is the number of thermal photons at the frequency $\omega_0$, calculated using the
Planck distribution. As both $\gamma_{a2}$ and $\gamma_{a3}$ are temperature dependent, we can introduce a
positive real number $\gamma_0(T)$. In terms of this quantity, $\gamma_{a2} = \gamma_0(T)[\langle N_D\rangle_{ss} + 1]$ and $\gamma_{a3} =
\gamma_0(T)\langle N_D\rangle_{ss}$.


4.3.3 A Damped Two-Level Atom


In this section we apply the Lindblad equation to a two-level atom to see how its coupling 
to a reservoir leads to its damping. We have considered this problem from a different angle 
when we used the Weisskopf-Wigner theory to derive the rate of spontaneous emission for
an excited atom. We adopt the notation used in Section 4.1.4 and make use of Pauli spin 
operators introduced in Aside 4.3. 

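As before, the two-level atom has a ground state $|g\rangle$ with energy $E_g$ and an excited state
$|e\rangle$ with energy $E_e$. The energy difference between these states is written as $E_e - E_g = \hbar\omega_{eg}$.
The device Hamiltonian takes the form $H_D = \frac{1}{2}\hbar\omega_{eg}\sigma_3$. The Pauli operators $\sigma_\pm$ acting on the eigenstates of this
Hamiltonian either reduce or increase the energy by $\hbar\omega_{eg}$ because $[H_D, \sigma_\pm] = \pm\hbar\omega_{eg}\sigma_\pm$. The
appropriate basis in the Liouville space for this system is the set $\{A_1 = I_D, A_2 = \sigma_3, A_3 =
\sigma_-, A_4 = \sigma_+\}$, where $I_D$ is the identity element. Therefore, we introduce three positive
eigenvalues $\{\gamma_{a2}, \gamma_{a3}, \gamma_{a4}\}$ and obtain the Lindblad equation (4.93) in the following form:

$$\begin{aligned}
\frac{d\rho_D}{dt} = {} & -\frac{i}{2}\omega_{eg}[\sigma_3, \rho_D(t)] + \frac{1}{2}\gamma_{a2}\left([\sigma_3\rho_D(t), \sigma_3] + [\sigma_3, \rho_D(t)\sigma_3]\right) \\
& + \frac{1}{2}\gamma_{a3}\left([\sigma_-\rho_D(t), \sigma_+] + [\sigma_-, \rho_D(t)\sigma_+]\right) + \frac{1}{2}\gamma_{a4}\left([\sigma_+\rho_D(t), \sigma_-] + [\sigma_+, \rho_D(t)\sigma_-]\right).
\end{aligned} \tag{4.104}$$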

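This equation can be simplified using the commutation relations given in Aside 4.3 to
obtain

$$\begin{aligned}
\frac{d\rho_D}{dt} = {} & -\frac{i}{2}\omega_{eg}[\sigma_3, \rho_D(t)] + \gamma_{a2}\left[\sigma_3\rho_D(t)\sigma_3 - \rho_D(t)\right] \\
& + \frac{1}{2}\gamma_{a3}\left[2\sigma_-\rho_D(t)\sigma_+ - \sigma_+\sigma_-\rho_D(t) - \rho_D(t)\sigma_+\sigma_-\right] \\
& + \frac{1}{2}\gamma_{a4}\left[2\sigma_+\rho_D(t)\sigma_- - \sigma_-\sigma_+\rho_D(t) - \rho_D(t)\sigma_-\sigma_+\right].
\end{aligned} \tag{4.105}$$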


It is possible to make connections with the analysis for a damped harmonic oscillator
(in Section 4.3.2) by identifying $\hat{a} \to \sigma_-$ and $\hat{a}^\dagger \to \sigma_+$. If we discard the $\gamma_{a2}$ term
leading to the Lamb shift in Eq. (4.105), we can use Eq. (4.102) and write the remaining
decay rates as $\gamma_{a3} = \gamma_e[\langle N_D\rangle_{ss} + 1]$ and $\gamma_{a4} = \gamma_e\langle N_D\rangle_{ss}$, where $\gamma_e$ is a constant that may
depend on temperature and $\langle N_D\rangle_{ss}$ is the number of thermal photons at the frequency $\omega_{eg}$
(calculated using the Planck distribution). If we now compare these coefficients with the
Weisskopf-Wigner theory of spontaneous emission in Section 4.1.3, we find that $\gamma_e$ is just
the spontaneous-emission rate given by

$$\gamma_e = \frac{\omega_{eg}^3\, |\langle e|\mathbf{d}|g\rangle|^2}{3\pi\epsilon_0\hbar c^3}. \tag{4.106}$$
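
As a rough numerical illustration of Eq. (4.106) and of the rates $\gamma_{a3}$ and $\gamma_{a4}$, the sketch below evaluates them for an assumed transition wavelength of 600 nm, an assumed dipole moment of $10^{-29}$ C m, and room temperature; none of these values come from the text, and the function names are hypothetical. It also uses the population equation that follows from the diagonal elements of Eq. (4.105), $dP_e/dt = -\gamma_{a3}P_e + \gamma_{a4}(1 - P_e)$, to find the steady-state excited-state population.

```python
import math

HBAR = 1.054571817e-34   # J s
KB = 1.380649e-23        # J/K
EPS0 = 8.8541878128e-12  # F/m
C = 2.99792458e8         # m/s

def spontaneous_rate(omega_eg, dipole):
    """Spontaneous-emission rate of Eq. (4.106) for a dipole matrix element (C m)."""
    return omega_eg**3 * dipole**2 / (3.0 * math.pi * EPS0 * HBAR * C**3)

def thermal_photons(omega, temperature):
    """Planck occupation number at angular frequency omega."""
    return 1.0 / math.expm1(HBAR * omega / (KB * temperature))

def excited_population_ss(gamma_a3, gamma_a4):
    """Steady state of dP_e/dt = -gamma_a3 * P_e + gamma_a4 * (1 - P_e)."""
    return gamma_a4 / (gamma_a3 + gamma_a4)

# Assumed illustrative parameters (not from the text).
omega_eg = 2.0 * math.pi * C / 600e-9   # rad/s
dipole = 1.0e-29                        # C m
temperature = 300.0                     # K

gamma_e = spontaneous_rate(omega_eg, dipole)
n_th = thermal_photons(omega_eg, temperature)
gamma_a3 = gamma_e * (n_th + 1.0)   # spontaneous plus stimulated emission
gamma_a4 = gamma_e * n_th           # absorption
p_e = excited_population_ss(gamma_a3, gamma_a4)
```

The ratio $P_e/(1 - P_e) = \gamma_{a4}/\gamma_{a3}$ reproduces the Boltzmann factor $\exp(-\hbar\omega_{eg}/k_BT)$, which is the two-level counterpart of Eq. (4.103). At optical frequencies and room temperature the thermal photon number is negligible, so $\gamma_{a3} \approx \gamma_e$.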

We stress that $\gamma_{a3} = \gamma_e[\langle N_D\rangle_{ss} + 1]$ contains two emission terms: $\gamma_e\langle N_D\rangle_{ss}$ representing
the rate of stimulated emission and $\gamma_e$ representing the rate of spontaneous emission. In
contrast, $\gamma_{a4}$ is responsible for absorption.


4.4 Redfield Equation


As we have seen in the preceding section, the Lindblad equation provides a way to track 
evolution of the density matrix of a device coupled to a reservoir. However, the map 
describing this evolution is not always unitary. Indeed, it is this nonunitary feature that 
introduces dissipation and decoherence in the standard formulation of quantum mechanics.
The Kraus operator-sum representation provides such a map, although the resulting
description is not unique [283, 282]. We can view the Redfield equation [289] as a gen- 
eralization of the Lindblad equation because the latter can be derived from the former. 
However, this process cannot be reversed because not all resonant interactions with the 
environment are retained in the so-called secular approximation that converts the Red- 
field equation into the Lindblad equation. What is very important to note is that both 
the Redfield and Lindblad equations are Markovian master equations that are valid only 
for quantum devices that are weakly coupled to a large reservoir. Even though the Red- 
field equation is trace-preserving (just like the Lindblad equation), it does not guarantee a 
positive time-evolution of the density matrix, which opens the possibility for having neg- 
ative populations (clearly an unphysical situation). In spite of this, the Redfield equation 
converges asymptotically to the thermal equilibrium distribution set by the reservoir. 


4.4.1 Derivation of the Redfield Equation 


We again consider a quantum device, coupled weakly to a large reservoir with the total
Hamiltonian $H_{DR} = H_D + H_R + H_I$, where $H_I$ is the interaction part. As the combination
$D + R$ is a closed system, its density matrix $\rho_{DR}(t)$ obeys the standard Liouville equation

$$\frac{d}{dt}\rho_{DR}(t) = -\frac{i}{\hbar}[H_{DR}, \rho_{DR}(t)]. \tag{4.107}$$

It is useful to transform this equation to the interaction picture. For this transformation, we
adopt the notation of Section 2.2.5 and represent the operators in the interaction picture
with a tilde on top. In terms of the evolution operator $U_{DR}(t) = \exp[-\frac{i}{\hbar}(H_D + H_R)t]$, we
have

$$\tilde{\rho}_{DR}(t) = U_{DR}^\dagger(t)\rho_{DR}(t)U_{DR}(t), \qquad \tilde{H}_I(t) = U_{DR}^\dagger(t)H_I U_{DR}(t). \tag{4.108}$$


The resulting Liouville equation in the interaction picture has the form

$$\frac{d}{dt}\tilde{\rho}_{DR}(t) = -\frac{i}{\hbar}[\tilde{H}_I(t), \tilde{\rho}_{DR}(t)]. \tag{4.109}$$


This equation can be solved implicitly to obtain

$$\tilde{\rho}_{DR}(t) = \tilde{\rho}_{DR}(0) - \frac{i}{\hbar}\int_0^t [\tilde{H}_I(t'), \tilde{\rho}_{DR}(t')]\, dt', \tag{4.110}$$

where $\tilde{\rho}_{DR}(0)$ is the initial value of the density operator. It is not possible to find an exact
solution of this integral equation. However, it can be solved approximately as a series
expansion by replacing $\tilde{\rho}_{DR}(t')$ on the right side with its value at $t = 0$, and repeating this
procedure multiple times.

As our focus is on the quantum device, in Eq. (4.109) we trace over the reservoir to
obtain the reduced density operator of the device. Using a series expansion and retaining
terms up to second order, the final result is given by

$$\frac{d}{dt}\tilde{\rho}_D(t) = -\frac{i}{\hbar}\mathrm{Tr}_R[\tilde{H}_I(t), \tilde{\rho}_{DR}(0)] - \frac{1}{\hbar^2}\int_0^t \mathrm{Tr}_R[\tilde{H}_I(t), [\tilde{H}_I(t'), \tilde{\rho}_{DR}(t')]]\, dt'. \tag{4.111}$$


As before, it is reasonable to assume that no initial correlation exists between the device and 
the reservoir (i.e., $\rho_{DR}(0) = \rho_D(0) \otimes \rho_R(0)$). Although this assumption may be unreasonable
in some specific situations [271], we assume that it holds for our quantum device. If we
further assume that the evolution of the device has negligible influence on the reservoir
($\rho_R(t) \approx \rho_R(0)$), we arrive at the Born approximation: $\tilde{\rho}_{DR}(t) \approx \tilde{\rho}_D(t) \otimes \rho_R(0)$. As a result,
the composite system remains in an approximate product state at all times, and temporal 
changes in the density matrix of the environment can be neglected. This is justified by 
noting that the reservoir represents a large environment with many degrees of freedom, 
and it remains in thermal equilibrium because it interacts with the device weakly [272, 
273, 274]. Under these conditions, 


$$\mathrm{Tr}_R[\tilde{H}_I(t), \tilde{\rho}_{DR}(0)] = \mathrm{Tr}_R[\tilde{H}_I(t), \rho_D(0) \otimes \rho_R(0)] \to 0, \tag{4.112}$$

because when taking the trace over the reservoir using the eigenstates of $H_R$, the commutator
bracket vanishes owing to the diagonal representation of $\rho_R(0)$. As the trace is basis invariant,
the above result holds for any other basis for the reservoir as well.

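With these simplifications, Eq. (4.111) reduces to

$$\frac{d}{dt}\tilde{\rho}_D(t) = -\frac{1}{\hbar^2}\int_0^t \mathrm{Tr}_R[\tilde{H}_I(t), [\tilde{H}_I(t'), \tilde{\rho}_D(t') \otimes \rho_R(0)]]\, dt'. \tag{4.113}$$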


This is still an integro-differential equation and it cannot be solved easily. To simplify it, we 
make use of the Markovian approximation, which amounts to assuming that the quantum 
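device has no memory of its past. We thus replace $\tilde{\rho}_D(t')$ with $\tilde{\rho}_D(t)$ in Eq. (4.113) and
obtain the Redfield equation in the form [289, 272]

$$\frac{d}{dt}\tilde{\rho}_D(t) = -\frac{1}{\hbar^2}\int_0^t \mathrm{Tr}_R[\tilde{H}_I(t), [\tilde{H}_I(t'), \tilde{\rho}_D(t) \otimes \rho_R(0)]]\, dt'. \tag{4.114}$$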


The Redfield equation is useful in quantum optics but has found applications in many other 
fields including magnetic resonance [290, 291] and optical spectroscopy [292]. 

Even though the Redfield equation is local in time for the reduced density matrix, it is 
still not fully Markovian as it contains an implicit dependence on the initial value at t = 0 
of the reduced density operator. By making the substitution $t' \to t - t_1$ and extending the
upper integration limit to infinity, we obtain a fully Markovian master equation in the form

$$\frac{d}{dt}\tilde{\rho}_D(t) = -\frac{1}{\hbar^2}\int_0^\infty \mathrm{Tr}_R[\tilde{H}_I(t), [\tilde{H}_I(t - t_1), \tilde{\rho}_D(t) \otimes \rho_R(0)]]\, dt_1. \tag{4.115}$$

This equation is valid for reservoirs with a “short memory” and is justified when the time
scale over which the quantum device evolves can be considered large compared to the
time scale over which correlations in the reservoir decay. As a result, the Markovian version
of the Redfield equation cannot resolve device dynamics over time scales comparable
to or shorter than the correlation time of the reservoir. Thus, even though the Markovian 
Redfield equation is a differential equation, it can only describe device evolution over a 
coarse-grained time scale [272]. Also, there is no guarantee that Eq. (4.115) describes a 
generator of a dynamical semigroup [293, 294, 272]. Another issue is that Eq. (4.115) con- 
tains rapidly oscillating terms that are problematic for its numerical implementation. A way 
to remove these rapidly oscillating terms is provided by the rotating-wave approximation 
that we consider next. 


4.4.2 Rotating-Wave Approximation 


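To implement the rotating-wave approximation, we write the interaction Hamiltonian $H_I$
in the form [272]:

$$H_I = \sum_\alpha \Upsilon_\alpha \otimes A_\alpha, \tag{4.116}$$

where the Hermitian operators $\Upsilon_\alpha$ and $A_\alpha$ act respectively on the quantum device and the
reservoir. We denote eigenvalues of the Hamiltonian $H_D$ of the quantum device by $\epsilon$, and
the corresponding eigenstates by $|\epsilon\rangle$. For any two eigenstates such that $\epsilon' - \epsilon = \hbar\omega$, we
define a new operator at the frequency $\omega$ as

$$\Upsilon_\alpha(\omega) = \sum_{\epsilon,\epsilon'} \delta_{\epsilon'-\epsilon,\hbar\omega}\, |\epsilon\rangle\langle\epsilon|\Upsilon_\alpha|\epsilon'\rangle\langle\epsilon'|. \tag{4.117}$$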


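We calculate the commutator $[H_D, \Upsilon_\alpha(\omega)]$ as

$$\begin{aligned}
[H_D, \Upsilon_\alpha(\omega)] &= \sum_{\epsilon,\epsilon'} \delta_{\epsilon'-\epsilon,\hbar\omega}\, \epsilon\, |\epsilon\rangle\langle\epsilon|\Upsilon_\alpha|\epsilon'\rangle\langle\epsilon'| - \sum_{\epsilon,\epsilon'} \delta_{\epsilon'-\epsilon,\hbar\omega}\, |\epsilon\rangle\langle\epsilon|\Upsilon_\alpha|\epsilon'\rangle\, \epsilon'\, \langle\epsilon'| \\
&= \sum_{\epsilon,\epsilon'} (\epsilon - \epsilon')\, \delta_{\epsilon'-\epsilon,\hbar\omega}\, |\epsilon\rangle\langle\epsilon|\Upsilon_\alpha|\epsilon'\rangle\langle\epsilon'| = -\hbar\omega\, \Upsilon_\alpha(\omega).
\end{aligned} \tag{4.118}$$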


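Similarly, we can show that $[H_D, \Upsilon_\alpha^\dagger(\omega)] = \hbar\omega\Upsilon_\alpha^\dagger(\omega)$. It follows that the relation $\Upsilon_\alpha^\dagger(\omega) =
\Upsilon_\alpha(-\omega)$ holds. We can also show that $[H_D, \Upsilon_\alpha^\dagger(\omega)\Upsilon_\alpha(\omega)] = 0$. As the energy eigenstates
form a complete basis in the associated Hilbert space, the sum over all energy levels such
that $\epsilon' - \epsilon = \hbar\omega$ amounts to summing over all frequencies. Thus, the original operator can
be written as

$$\Upsilon_\alpha = \sum_\omega \Upsilon_\alpha(\omega). \tag{4.119}$$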
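
The decomposition in Eqs. (4.117) to (4.119) can be made concrete with a small sketch. It assumes $\hbar = 1$, a three-level device with integer energies (so that Bohr frequencies compare exactly), and an arbitrary Hermitian matrix standing in for $\Upsilon_\alpha$ in the energy eigenbasis; the names `eigenoperators` and `commutator_with_h` are hypothetical.

```python
from collections import defaultdict

def eigenoperators(energies, upsilon):
    """Split a matrix given in the energy eigenbasis into parts Upsilon(omega).

    Entry (i, j) stands for <eps_i| Upsilon |eps_j> and is assigned to the
    Bohr frequency omega = eps_j - eps_i (hbar = 1), following Eq. (4.117).
    """
    n = len(energies)
    parts = defaultdict(lambda: [[0j] * n for _ in range(n)])
    for i in range(n):
        for j in range(n):
            if upsilon[i][j] != 0:
                parts[energies[j] - energies[i]][i][j] = upsilon[i][j]
    return dict(parts)

def commutator_with_h(energies, m):
    """[H_D, M] for a diagonal H_D: element (i, j) is (eps_i - eps_j) * M_ij."""
    n = len(energies)
    return [[(energies[i] - energies[j]) * m[i][j] for j in range(n)] for i in range(n)]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed example: three levels and an arbitrary Hermitian coupling operator.
energies = [0, 1, 2]
upsilon = [[0, 1, 0.5j],
           [1, 0, 2],
           [-0.5j, 2, 0]]

parts = eigenoperators(energies, upsilon)
total = [[sum(parts[w][i][j] for w in parts) for j in range(3)] for i in range(3)]
```

Each component satisfies $[H_D, \Upsilon_\alpha(\omega)] = -\omega\Upsilon_\alpha(\omega)$ in these units, the components sum back to the original operator, and the component at $-\omega$ is the Hermitian conjugate of the one at $\omega$.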


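
As the Redfield equation (4.115) is written in the interaction picture, we need to write
the operator $\Upsilon_\alpha$ in the interaction picture using

$$\tilde{\Upsilon}_\alpha(t) = \exp\left(\frac{i}{\hbar}H_D t\right)\left(\sum_\omega \Upsilon_\alpha(\omega)\right)\exp\left(-\frac{i}{\hbar}H_D t\right). \tag{4.120}$$

We make use of the Baker-Hausdorff formula for any two operators $A$ and $B$:

$$\exp(iBt)\, A \exp(-iBt) = A + it[B, A] + \frac{(it)^2}{2!}[B, [B, A]] + \cdots. \tag{4.121}$$

Using it, we find the relations

$$\tilde{\Upsilon}_\alpha(t) = \sum_\omega \exp(-i\omega t)\, \Upsilon_\alpha(\omega), \qquad \tilde{\Upsilon}_\alpha^\dagger(t) = \sum_\omega \exp(i\omega t)\, \Upsilon_\alpha^\dagger(\omega). \tag{4.122}$$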


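Combining these results, the interaction Hamiltonian can be written as

$$\tilde{H}_I(t) = \sum_\alpha \sum_\omega \exp(-i\omega t)\, \Upsilon_\alpha(\omega) \otimes \tilde{A}_\alpha(t), \tag{4.123}$$

where $\tilde{A}_\alpha(t)$ is given by

$$\tilde{A}_\alpha(t) = \exp\left(\frac{i}{\hbar}H_R t\right) A_\alpha \exp\left(-\frac{i}{\hbar}H_R t\right). \tag{4.124}$$

It follows from Eq. (4.112) that the average value of $\tilde{A}_\alpha(t)$ is zero.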
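Before using these results in Eq. (4.115), we rewrite that equation in the following equivalent
form:

$$\frac{d}{dt}\tilde{\rho}_D(t) = \frac{1}{\hbar^2}\int_0^\infty \mathrm{Tr}_R\left[\tilde{H}_I(t - t_1)\, \tilde{\rho}_D(t) \otimes \rho_R(0)\, \tilde{H}_I(t) - \tilde{H}_I(t)\tilde{H}_I(t - t_1)\, \tilde{\rho}_D(t) \otimes \rho_R(0)\right] dt_1 + \text{H.C.}, \tag{4.125}$$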


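where H.C. denotes the Hermitian conjugate of the previous expression. Substituting the
interaction Hamiltonian from Eq. (4.123) and simplifying by rearranging terms, we obtain

$$\frac{d}{dt}\tilde{\rho}_D(t) = \sum_{\alpha,\alpha'} \sum_{\omega,\omega'} \Gamma_{\alpha\alpha'}(\omega) \exp[i(\omega' - \omega)t] \left(\Upsilon_{\alpha'}(\omega)\tilde{\rho}_D(t)\Upsilon_\alpha^\dagger(\omega') - \Upsilon_\alpha^\dagger(\omega')\Upsilon_{\alpha'}(\omega)\tilde{\rho}_D(t)\right) + \text{H.C.}, \tag{4.126}$$

where we have introduced

$$\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\int_0^\infty \mathrm{Tr}_R\left[\tilde{A}_\alpha^\dagger(t)\tilde{A}_{\alpha'}(t - t')\rho_R(0)\right] \exp(i\omega t')\, dt'. \tag{4.127}$$

The trace over the reservoir represents an averaging procedure. Thus, we can write $\Gamma_{\alpha\alpha'}(\omega)$ in
terms of the reservoir's correlation function as

$$\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\int_0^\infty \langle\tilde{A}_\alpha^\dagger(t)\tilde{A}_{\alpha'}(t - t')\rangle \exp(i\omega t')\, dt'. \tag{4.128}$$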


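Because the reservoir is large and remains in thermal equilibrium ($[H_R, \rho_R(0)] = 0$), the
correlation function depends only on the time difference [272]:

$$\langle\tilde{A}_\alpha^\dagger(t)\tilde{A}_{\alpha'}(t - t')\rangle = \langle\tilde{A}_\alpha^\dagger(t')\tilde{A}_{\alpha'}(0)\rangle. \tag{4.129}$$

Equation (4.126) is still too complicated. We can simplify it by noting that the term
$\exp[i(\omega' - \omega)t]$ oscillates rapidly when $\omega \neq \omega'$ and $t \gg |\omega' - \omega|^{-1}$. We can eliminate such
terms by making the rotating-wave approximation and setting $\omega' = \omega$, resulting in

$$\frac{d}{dt}\tilde{\rho}_D(t) = \sum_{\alpha,\alpha'} \sum_\omega \Gamma_{\alpha\alpha'}(\omega) \left[\Upsilon_{\alpha'}(\omega)\tilde{\rho}_D(t)\Upsilon_\alpha^\dagger(\omega) - \Upsilon_\alpha^\dagger(\omega)\Upsilon_{\alpha'}(\omega)\tilde{\rho}_D(t)\right] + \text{H.C.} \tag{4.130}$$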


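To simplify it further, we introduce the Fourier transform of the reservoir's correlation
function as

$$\gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\int_{-\infty}^{\infty} \langle\tilde{A}_\alpha^\dagger(t)\tilde{A}_{\alpha'}(t - t')\rangle \exp(i\omega t')\, dt', \tag{4.131}$$

and note that $\Gamma_{\alpha\alpha'}(\omega)$ is related to this quantity as [272]

$$\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{2}\gamma_{\alpha\alpha'}(\omega) + iS_{\alpha\alpha'}(\omega). \tag{4.132}$$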


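Using these results in Eq. (4.130), we obtain the Redfield equation in the rotating-wave
approximation:

$$\frac{d}{dt}\tilde{\rho}_D(t) = -\frac{i}{\hbar}[H_{LS}, \tilde{\rho}_D(t)] + \mathcal{D}(\tilde{\rho}_D(t)), \tag{4.133}$$

where the dissipator is defined as

$$\mathcal{D}(\tilde{\rho}_D(t)) = \frac{1}{2}\sum_{\alpha,\alpha'} \sum_\omega \gamma_{\alpha\alpha'}(\omega) \left([\Upsilon_{\alpha'}(\omega)\tilde{\rho}_D(t), \Upsilon_\alpha^\dagger(\omega)] + [\Upsilon_{\alpha'}(\omega), \tilde{\rho}_D(t)\Upsilon_\alpha^\dagger(\omega)]\right). \tag{4.134}$$

The Hamiltonian $H_{LS}$ in Eq. (4.133) is called the Lamb-shift Hamiltonian because it causes
a Lamb-type shift of the unperturbed energy levels. It is defined as

$$H_{LS} = \hbar\sum_{\alpha,\alpha'} \sum_\omega S_{\alpha\alpha'}(\omega)\, \Upsilon_\alpha^\dagger(\omega)\Upsilon_{\alpha'}(\omega). \tag{4.135}$$

It is easy to show that the Lamb-shift Hamiltonian commutes with the device Hamiltonian
(i.e., $[H_{LS}, H_D] = 0$).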


4.5 Quantum-Optics Master Equation 


A typical situation found in quantum optics is that many electromagnetic modes of the 
reservoir interact weakly with a quantum device. These interactions enable the flow of 
energy from the device to these modes. However, owing to the weak nature of the inter- 
action, it takes much longer for energy to flow back to the device once it has left. The 
reason for this behavior is related to different Rabi frequencies associated with different 
radiation modes. In essence, because of a continuum of radiation modes, energy transfer 
to the reservoir is an irreversible process that destroys the coherence of the atomic state. It 
is important to stress that this energy transfer satisfies well both the Markovian and Born
approximations [295, 296, 297]. Thus, we can apply the Markovian version of the Redfield
equation (4.133) to describe this situation and understand how a quantum device interacts
with the radiation modes. The resulting equation is the quantum-optics master equation.
We use the dipole approximation to write the interaction Hamiltonian as

$$H_I = -\mathbf{d}\cdot\mathbf{E} = -\sum_{\alpha=1}^{3} d_\alpha E_\alpha, \tag{4.136}$$


where $\alpha = 1, 2, 3$ represent the three Cartesian coordinates, $\mathbf{d}$ is the dipole-moment operator,
and $\mathbf{E}$ is the interacting electric field, assumed to be a plane wave polarized along
the direction $\mathbf{p}_s$. This unit vector is always orthogonal to the propagation vector $\mathbf{k}$ (i.e.,
$\mathbf{k}\cdot\mathbf{p}_s = 0$). The frequency of the plane wave is given by $\omega_k = ck$. The quantized form of
the electric field is given by

$$\mathbf{E} = \sum_{k}\sum_{s} \sqrt{\frac{\hbar\omega_k}{2\epsilon_0 V}}\left(\hat{a}_{ks}^\dagger + \hat{a}_{ks}\right)\mathbf{p}_s, \tag{4.137}$$


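where $V$ is the reservoir volume. The total Hamiltonian of the device plus reservoir is thus
given by

$$H = H_D + \sum_{k,s} \hbar\omega_k\, \hat{a}_{ks}^\dagger\hat{a}_{ks} - \sum_\alpha d_\alpha \otimes E_\alpha, \tag{4.138}$$

where $H_D$ is the device Hamiltonian and $[\hat{a}_{ks}, \hat{a}_{k's'}^\dagger] = \delta_{kk'}\delta_{ss'}$.

We assume that the electromagnetic radiation interacting with the quantum device is in
thermal equilibrium. In this situation, the density operator of a mode with frequency $\omega_{kj}$
has the form

$$\rho_R(\omega_{kj}) = \left[1 - \exp\left(-\frac{\hbar\omega_{kj}}{k_BT}\right)\right] \sum_{n_{kj}=0}^{\infty} \exp\left[-n_{kj}\left(\frac{\hbar\omega_{kj}}{k_BT}\right)\right] |n_{kj}\rangle\langle n_{kj}|, \tag{4.139}$$

where $n_{kj}$ represents the number of photons in the state $|n_{kj}\rangle$. Using this expression, we can
write the equilibrium density operator $\rho_R(0)$ in Eq. (4.115) as

$$\rho_R(0) = \rho_R(\omega_{k1}) \otimes \rho_R(\omega_{k2}) \otimes \cdots \otimes \rho_R(\omega_{km}) \otimes \cdots. \tag{4.140}$$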


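The utility of this expression is that we can calculate averages using the relation
$\langle O\rangle = \mathrm{Tr}[O\rho_R(0)]$, where $O$ is any operator. It is easy to show that

$$\langle\hat{a}_{ks}\hat{a}_{k's'}\rangle = 0, \qquad \langle\hat{a}_{ks}^\dagger\hat{a}_{k's'}^\dagger\rangle = 0, \tag{4.141}$$

$$\langle\hat{a}_{ks}\hat{a}_{k's'}^\dagger\rangle = \delta_{ks,k's'}\left[1 + N_{BE}(\omega_k)\right], \tag{4.142}$$

$$\langle\hat{a}_{ks}^\dagger\hat{a}_{k's'}\rangle = \delta_{ks,k's'}\, N_{BE}(\omega_k), \tag{4.143}$$

where the Bose-Einstein occupation number is given by

$$N_{BE}(\omega_k) = \frac{1}{\exp(\hbar\omega_k/k_BT) - 1}. \tag{4.144}$$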
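
The single-mode averages in Eqs. (4.142) to (4.144) follow directly from the thermal density operator of Eq. (4.139), which the following sketch verifies numerically for one mode. It works with the dimensionless ratio $x = \hbar\omega_k/k_BT$ and truncates the photon-number sum at an assumed cutoff `n_max`; both the cutoff and the function names are illustrative choices.

```python
import math

def thermal_populations(x, n_max):
    """Diagonal elements of Eq. (4.139) for x = hbar*omega/(kB*T), truncated at n_max."""
    norm = 1.0 - math.exp(-x)
    return [norm * math.exp(-n * x) for n in range(n_max + 1)]

def bose_einstein(x):
    """Occupation number of Eq. (4.144) as a function of x = hbar*omega/(kB*T)."""
    return 1.0 / math.expm1(x)

x = 0.5                                # assumed ratio hbar*omega/(kB*T)
pops = thermal_populations(x, 400)     # truncation error is of order exp(-200)

total = sum(pops)                                            # Tr[rho_R]
mean_adag_a = sum(n * p for n, p in enumerate(pops))         # <a^dagger a>
mean_a_adag = sum((n + 1) * p for n, p in enumerate(pops))   # <a a^dagger>
```

Because the density operator is diagonal in the number basis, averages such as $\langle\hat{a}\hat{a}\rangle$ vanish identically, while the two remaining averages differ by exactly one, as required by the commutation relation.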



If the equilibrium condition of the reservoir is different (a squeezed state rather than thermal
equilibrium), the correlation averages may have different values, but the following
analysis remains valid.

We can use the preceding results to calculate the most important quantity in the Redfield
equation (4.133), namely the reservoir correlation function $\Gamma_{\alpha\alpha'}(\omega)$. Using Eq. (4.127), it
can be written as

$$\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\int_0^\infty \langle\tilde{E}_\alpha^\dagger(t)\tilde{E}_{\alpha'}(t - t')\rangle \exp(i\omega t')\, dt'. \tag{4.145}$$


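Substituting $\mathbf{E}$ from Eq. (4.137), we obtain

$$\begin{aligned}
\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\int_0^\infty \sum_{k,s}\sum_{k',s'} & \sqrt{\frac{\hbar\omega_k}{2\epsilon_0 V}}\sqrt{\frac{\hbar\omega_{k'}}{2\epsilon_0 V}}\; p_{s\alpha}\, p_{s'\alpha'} \exp(i\omega t') \\
\times \Big( & \langle\hat{a}_{ks}^\dagger\hat{a}_{k's'}^\dagger\rangle \exp[i\omega_k t + i\omega_{k'}(t - t')] + \langle\hat{a}_{ks}\hat{a}_{k's'}\rangle \exp[-i\omega_k t - i\omega_{k'}(t - t')] \\
& + \langle\hat{a}_{ks}\hat{a}_{k's'}^\dagger\rangle \exp[-i\omega_k t + i\omega_{k'}(t - t')] + \langle\hat{a}_{ks}^\dagger\hat{a}_{k's'}\rangle \exp[i\omega_k t - i\omega_{k'}(t - t')] \Big)\, dt'.
\end{aligned} \tag{4.146}$$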


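If we now apply the averages given in Eq. (4.141) and note that $p_{s\alpha}\, p_{s'\alpha'} = (p_{s\alpha})^2\, \delta_{\alpha\alpha'}\delta_{ss'}$,
we obtain

$$\Gamma_{\alpha\alpha'}(\omega) = \frac{1}{\hbar^2}\sum_{k,s} \frac{\hbar\omega_k}{2\epsilon_0 V}\, (p_{s\alpha})^2\, \delta_{\alpha\alpha'}\delta_{ss'} \left( [1 + N_{BE}(\omega_k)] \int_0^\infty \exp[-i(\omega_k - \omega)t']\, dt' + N_{BE}(\omega_k) \int_0^\infty \exp[i(\omega_k + \omega)t']\, dt' \right). \tag{4.147}$$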


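Finally, using the Sokhotski-Plemelj formula given in Aside 3.3, $\Gamma_{\alpha\alpha'}(\omega)$ can be written
in the form

$$\Gamma_{\alpha\alpha'}(\omega) = \delta_{\alpha\alpha'}\left(\frac{1}{2}\gamma(\omega) + iS(\omega)\right), \tag{4.148}$$

where $\gamma(\omega) = (\omega^3 (p_\alpha)^2/3\pi\epsilon_0\hbar c^3)\, b(\omega)$ and $b(\omega)$ is defined as

$$b(\omega) = \begin{cases} 1 + N_{BE}(\omega), & \omega > 0, \\ -N_{BE}(-\omega), & \omega < 0. \end{cases} \tag{4.149}$$

The corresponding Lamb shift of energy levels is given by

$$S(\omega) = \frac{(p_\alpha)^2}{6\pi^2\hbar\epsilon_0 c^3}\, \mathrm{P.V.}\int_0^\infty \omega_k^3 \left[\frac{1 + N_{BE}(\omega_k)}{\omega - \omega_k} + \frac{N_{BE}(\omega_k)}{\omega + \omega_k}\right] d\omega_k. \tag{4.150}$$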

When these quantities are used in Eq. (4.133), the resulting equation is called the quantum- 
optics master equation. 


Quantum Current Flow 


What we observe is not nature itself, but nature exposed to our method of questioning. 
Werner Heisenberg 


5.1 Quantum Transport


Ohm's law is an empirical law that is known to hold across a wide range of length scales. It
states that the local current density of a material is proportional to the electric field strength
at that point ($\mathbf{J} = \sigma\mathbf{E}$). The proportionality constant $\sigma$ is known as the conductivity of the
material, and it can even be defined using the same exact ratio when this law breaks down.
Indeed, Ohm’s law is known to break down when the electric field is very strong or very 
weak. As a result, it is natural to assume that it may not hold for nanoscale conductors 
where the discrete nature of electric charges cannot be ignored, but experiments have not 
met this expectation. It was observed in 2012 that Ohm’s law holds for silicon wires as 
small as four atoms wide and one atom high [298]. What this observation tells us is that we 
need to revise our understanding of conductivity, especially at the nanoscale, and develop 
a more accurate model of charge transport in nanoscale conductors. In this chapter, we 
focus on different ways of characterizing charge transfer through nanostructures, taking 
into account the quantum nature of charge carriers and that of the surrounding medium. 

Historically, charge transport in conductors was studied by Drude and Lorentz. The 
Drude theory predicts that the conductivity of a metal is given by $\sigma = N_e q_e^2 \tau_e/m_e$, where
$N_e$ is the density of electrons and $\tau_e$ is the mean collision time of electrons related to their
mean free path. In this model, conductivity of a metal is calculated by extending the local 
conductivity associated with an infinitesimal length of the conductor to a finite size using 
the well-established machinery of calculus. The model also assumes that charge transport 
through a conductor is essentially diffusive in nature with isotropic relaxation. 
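
To give a sense of scale, the Drude formula can be evaluated in a few lines. The electron density and collision time used below are assumed, order-of-magnitude values of the kind often quoted for copper at room temperature; they are not taken from the text, and the function name is hypothetical.

```python
Q_E = 1.602176634e-19   # elementary charge (C)
M_E = 9.1093837015e-31  # electron mass (kg)

def drude_conductivity(n_e, tau_e, q_e=Q_E, m_e=M_E):
    """Drude conductivity sigma = N_e * q_e**2 * tau_e / m_e in S/m."""
    return n_e * q_e**2 * tau_e / m_e

# Assumed illustrative values, roughly appropriate for copper.
n_copper = 8.5e28     # electrons per cubic meter
tau_copper = 2.5e-14  # mean collision time in seconds

sigma_copper = drude_conductivity(n_copper, tau_copper)
```

With these inputs the result is of the order of $10^7$ S/m, which is the right magnitude for a good metal. The formula is local and classical, which is precisely the limitation discussed next.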

It was realized in the 1950s that the Drude theory has a limited validity for quantum 
devices with nanoscale conductive channels. Rather than integrating local conductivity 
over the size of a conductor, another approach emerged where the conductance of the 
whole channel was found directly. This approach is known as the scattering method. The 
advantage of this method is that both the coherent and incoherent processes occurring 
within a quantum device can be accounted for when charge carriers are driven far from 
thermal equilibrium. By coherent processes we mean processes such as tunneling and ballistic
transport; incoherent processes include scattering via phonons and charge transport
via a hopping mechanism (in inorganic semiconductors). The scattering method also abandons
the notion that charge carriers behave like classical particles inside a conductor and
enables one to invoke the full machinery of quantum mechanics.

In electromagnetic theory, the “current” (measured in amperes) is defined as the amount
of charge (in coulombs) transferred through a given cross section of a conductor per unit
time (measured in seconds). This definition was also officially adopted by NIST (National
Institute of Standards and Technology, USA) on May 20, 2019, along with three other SI
base units: the kilogram (mass), kelvin (temperature), and mole (amount of substance). One
promising way to quantify current with high precision makes use of a nanoscale technique 
called single-electron transport (SET) pumping [299]. This technique is used at NIST for 
measuring currents. It involves applying a gate voltage to a transistor-like device, which 
ejects one electron through a high-resistance tunneling junction into a quantum island made 
using a microscopic quantum dot. As discussed in Aside 5.1, this electron is removed from 
the island via another tunneling junction. 


Aside 5.1 Single-Electron Transport (SET) Pumping 


Figure 5.1: (a) Schematic diagram of a single-electron transistor (SET). Current flows from the source to the drain, as in a
conventional MOSFET. (b) Schematic showing how the SET pumps individual electrons.

A highly precise current source can be built using a SET pump. As shown in Figure 5.1,
it is constructed by connecting two tunneling junctions and a gate electrode to a quantum
island capable of holding single electrons. The structure resembles a metal-oxide semiconductor
field-effect transistor (MOSFET) used in conventional electronic circuits. The gate
electrode controls the flow of electrons from the source to the drain. However, these electrons
need to cross two tunneling junctions that isolate the source and the drain from the quantum
island in the middle. The quantum island plays a key role in passing single electrons from
the source to the drain. The principle exploited is called the Coulomb blockade effect, a
well-understood phenomenon discussed in Section 1.5. This effect can be understood by
noting that, whenever an electron tunnels to the quantum island, the electrostatic energy
of the system increases by an amount $E_C$ given by

$$E_C = \frac{q_e^2}{2C_{SET}}, \tag{5.1}$$

where $C_{SET} = C_S + C_D + C_G$ is the total capacitance of the device, $C_S$, $C_D$, and $C_G$ being
capacitances between the quantum island and the source, the drain, or the gate, respectively.
Often, $E_C$ is referred to as the charging energy because it is precisely the energy gained
with the injection of a single electron.


The tunneling of an electron can only occur if an energy greater than $E_C$ is externally
supplied to the device. However, if the thermal energy $k_BT$ is larger than $E_C$, the process is
hampered by the operating temperature of the device. The charging energy can be increased
beyond the thermal energy by reducing the total capacitance $C_{SET}$, which amounts to reducing
the physical dimensions of tunneling junctions to near 100 nm. The characteristic time to
tunnel an electron through such a tunnel junction is given by $t_T = C_T R_T$, where $C_T$ can be
$C_S$, $C_D$, or $C_G$. However, the uncertainty principle induces an energy uncertainty of $h/t_T$ to
the required charging energy. To guarantee tunneling events, we need to ensure $E_C \gg h/t_T$.
This condition requires that the resistance of the tunnel junction be large enough to satisfy
$R_T > h/q_e^2$. The ratio $h/q_e^2$ is known as the von Klitzing constant; it has a numerical value
of 25.8 kΩ.


If all the preceding conditions are met, it is possible to operate the SET pump by sweeping
the gate voltage in such a way that the electrical current is generated through clocked
transport of individual electrons. If the gate is operated at a frequency $f_g$, the generated
current $I_{SP}$ can be written as $I_{SP} = q_e f_g$, which follows straight from the definition of
current in a conductor. Owing to recent advances in this field, SET pumps can generate
currents of about 100 pA with a relative uncertainty of $10^{-6}$ or better.
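
The design conditions of this Aside can be checked with a short calculation. The total capacitance (1 aF), the operating temperature (4.2 K), and the gate frequency (1 GHz) below are assumed values chosen only to illustrate the orders of magnitude; the function names are hypothetical.

```python
Q_E = 1.602176634e-19  # elementary charge (C), exact in the 2019 SI
H = 6.62607015e-34     # Planck constant (J s), exact in the 2019 SI
KB = 1.380649e-23      # Boltzmann constant (J/K), exact in the 2019 SI

def charging_energy(c_total):
    """Charging energy E_C = q_e**2 / (2 * C_SET) of Eq. (5.1), in joules."""
    return Q_E**2 / (2.0 * c_total)

def pump_current(gate_frequency):
    """Clocked single-electron current I_SP = q_e * f_g, in amperes."""
    return Q_E * gate_frequency

R_K = H / Q_E**2   # von Klitzing constant h / q_e**2, in ohms

# Assumed illustrative operating point (not from the text).
c_set = 1.0e-18     # total capacitance in farads
temperature = 4.2   # kelvin
f_gate = 1.0e9      # gate frequency in hertz

e_c = charging_energy(c_set)
thermal_energy = KB * temperature
i_sp = pump_current(f_gate)
```

For these assumed values the charging energy exceeds $k_BT$ by more than two orders of magnitude, and a 1 GHz gate drive corresponds to a current of about 160 pA, consistent with the current scale quoted above.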


Owing to the adopted definition of current by NIST, the measurement of current becomes 
a matter of counting individual electrons over a certain time interval. However, the classi- 
cal picture of an electron as a point particle loses its validity when its quantum nature is 
taken into account. The paradox of electrical current is that, even though charges are quan- 
tized, their transfer rate can have practically any value, even a fraction of the charge of a 
single electron per second; that is, the current, unlike charges, is not quantized. This can be 
understood by noting that electric current through a conductor is merely a displacement of 
the electron cloud against the lattice of positive cores. As this displacement can be by any 
amount, the associated electrical current is a continuously varying quantity. To calculate 
the electric current, we use the concept of current density in quantum mechanics. If the
motion of a quantized charge carrier of mass $m_e$ and charge $q_e$ is governed by the wave
function $\psi_e$, the resulting current density is given by

$$\mathbf{J} = \frac{q_e\hbar}{m_e}\, \Im\left[\psi_e^*\nabla\psi_e\right], \tag{5.2}$$

where $\Im[\ldots]$ stands for the imaginary part. The conductance of a quantum channel can be
used to characterize how this current interacts with other parts of the quantum system.


According to the semiclassical theory, there are no fundamental restrictions on the con- 
ductivity of a material, and it can have all continuous values in a range that depends on the 
intrinsic properties of a specific material. However, following the discovery of the quantum 
Hall effect, it became clear that conductance is not a continuous variable but changes in 
steps of a basic quantum unit, the so-called conductance quantum given by 


$$G_0 = q_e^2/h. \tag{5.3}$$


The existence of the conductance quantum was first observed in a 2D electron gas formed 
between GaAs and AlGaAs semiconducting layers [59, 60]. This discovery opened up a 
new area of research on quantum localization. A very useful relationship, due to Landauer, 
states that conductance of any material can be calculated by multiplying the conductance 
quantum $G_0$ with the quantum-mechanical transmission coefficient of that material. The
important question that seems like a paradox is how dissipation occurs as a result of current
flowing through a quantum conductor. Whenever a current $I$ flows through a material with
conductance $G$, it also dissipates energy at the rate $I^2/G$. As elastic scattering is the only
process responsible for the appearance of the conductance, there is no dynamical mecha- 
nism that can cause energy dissipation. The answer comes from the appearance of a contact 
resistance when a quantum conductor is connected to a reservoir. The reasoning is based 
on the fluctuation-dissipation theorem discussed in Section 3.3 that predicts the behavior 
of systems obeying detailed balance; electrical resistances in quantum systems are also 
covered by this theorem. 
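
The Landauer relation and the dissipation rate $I^2/G$ mentioned above can be illustrated with a minimal sketch. It uses the conductance quantum exactly as written in Eq. (5.3), treats each conducting channel as contributing its own transmission coefficient, and uses assumed transmission values and an assumed current; the function names are hypothetical.

```python
Q_E = 1.602176634e-19  # elementary charge (C)
H = 6.62607015e-34     # Planck constant (J s)

G0 = Q_E**2 / H        # conductance quantum of Eq. (5.3), in siemens

def landauer_conductance(transmissions):
    """Conductance G = G0 * (sum of channel transmission coefficients)."""
    return G0 * sum(transmissions)

def dissipated_power(current, conductance):
    """Power I**2 / G dissipated when a current I flows through conductance G."""
    return current**2 / conductance

# Assumed example: two channels with transmission 0.5 and 0.25, carrying 1 nA.
g_total = landauer_conductance([0.5, 0.25])
power = dissipated_power(1.0e-9, g_total)
```

A single perfectly transmitting channel has a resistance $1/G_0$ of about 25.8 kΩ even though no scattering takes place inside it, which is the contact resistance referred to in the text.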

A typical quantum transport system is depicted in Figure 5.2. It consists of a nanoscale 
quantum device that can either hold or transport charge carriers to the electrodes (which 
could be more than two in multilead scenarios). If an electrode supplies charge carriers, 
it is called a “source.” If an electrode removes charge carriers, it is termed a “drain.” 
Depending on details of the sources, drains, and the quantum device, such a system can 
be analyzed using several approaches differing in details of how the dynamics of charge 
carriers through the quantum device is treated. The first approach we consider is known 
as the Landauer-Büttiker method, and it provides a simple and intuitive description for
the majority of quantum systems found in practice. In this approach, charge dynamics is
considered as ballistic transport (pure elastic scattering) near thermal equilibrium. This
method completely obscures the quantum features of the device and treats it like a potential
barrier, which is represented by a scattering matrix. Such a passive approach may not
be suitable in cases where the quantum device (1) is nonlinear and its response covers a
broad frequency range, (2) functions far from equilibrium with a high source-drain voltage,
or (3) contains charge carriers that interact with each other as they move inside the device.
To handle such scenarios, it is common to invoke transport theory, especially tailored for
interacting particles under nonequilibrium conditions. The underlying method is known as
the nonequilibrium Green's function method.

Figure 5.2: A typical quantum transport setup where a quantum device is connected to two electrodes.


5.2 Landauer-Büttiker Method


The Landauer-Büttiker method maps quantum transport to an equivalent scattering problem
and establishes a relation between the scattering amplitude of a charge carrier in a
quantum device and its conducting properties (see Fig. 5.3). The beginnings of this method
can be traced back to the pioneering work in 1957 by Rolf Landauer [63, 64], who heuristically
derived an expression for the electric current using an approach based on scattering
theory. This work was refined in 1986 by Büttiker, who laid the foundation of the formalism
presented here [300]. This method is general to the extent that it can be applied to any
quantum channel containing noninteracting charge carriers. The absence of inelastic scat- 
tering can be relaxed, provided the mean-field description remains valid. In this method, 
interactions among charge carriers are accounted for through changes in the charge distri- 
bution that modify the scattering potential. This type of charge transport is called coherent 
because quantum coherence properties are preserved across the scattering region. 


5.2.1 Scattering Matrix Representation 


The scattering matrix S provides all the information needed to find the outputs at the ports 
of a quantum device, given any inputs at these ports (see Fig. 5.3). This matrix takes into 
account only linear operations taking place within the quantum device. Such an approach 
is also called the S-parameter method. The X-parameter method is a generalization of the 
S-parameter method and is used for characterizing quantum devices subject to large input
power levels that drive them into the nonlinear regime [301]. Here we focus on the S-parameter
method and consider the simplest case of a quantum device with only two ports.
Generalization to three or more ports is not difficult but is not considered in this book. The
utility of the scattering matrix is that it fully characterizes linear operations of a quantum
device without requiring a detailed knowledge of all components and processes inside the
quantum device. This feature makes it useful in practice, especially for devices that have
a complex internal setup.

Figure 5.3: Model of quantum transport based on scattering theory. Coupling of the quantum device to the electrodes is represented by a scattering matrix.

Figure 5.4 shows a two-port quantum device, each port supporting a single channel (or 
mode). The dynamics of the device is governed by a potential barrier situated at its center. 
One can view the ports as transmission lines bringing the wave function to this potential 
barrier for scattering. We employ a local coordinate system, with $z_\eta$ and $\mathbf{r}_{\perp\eta}$ denoting
the longitudinal and transverse coordinates at the port $\eta$ with $\eta = 1, 2$. The origin of
these coordinates lies at the device’s center. Assuming that the current flows along the 
longitudinal direction, we ignore the transverse coordinates in the following discussion. 

All charged particles (electrons) move freely on both sides of the potential barrier and
are affected by this barrier only at $z_\eta = 0$. Thus, the wave function of each particle is in
the form of a plane wave. Consider one such particle moving in the forward direction. Its
associated plane wave arrives at the potential barrier located at $z_\eta = 0$ from the left. As
a result of scattering, this plane wave is split into reflected and transmitted waves. However,
as the scattering process is assumed to be elastic, the number of particles as well as
their total energy and momentum must be conserved before and after the scattering event.
Assuming that the ports of the device are symmetric, the velocity of the particle at each
port is given by $v_\eta(E) = \hbar k_\eta(E)/m$, where $m$ is the particle’s mass, $\hbar k_\eta$ is its momentum,
and $E = \hbar^2 k_\eta^2/2m$ is its energy. Since velocities are equal for a given energy, $k_\eta$ is the same
at both ports. In general, both $E$ and $v_\eta$ can vary over a wide range.

As a result of scattering, there will be reflected and transmitted waves at both ports of
the device. At any given point along either of the ports, the total wave function will be a
superposition of these two waves. As the scattering matrix is defined using the amplitudes
of the forward and backward propagating plane waves (denoted by $a_\eta$ and $b_\eta$), the flux
density corresponds to $|a_\eta|^2$ and $|b_\eta|^2$. We normalize these amplitudes such that the particle
number is preserved. In mathematical terms, this normalization amounts to writing the
incident plane wave in the form $\psi(z_\eta, k_\eta) = (2\pi\hbar v_\eta)^{-1/2} \exp(i k_\eta z_\eta)$. This form ensures
that the following relation denoting the conservation of particles holds:

$$\iint \psi^*(z_\eta, k'_\eta)\, \psi(z_\eta, k_\eta)\, dz_\eta\, dE = \int \delta(k_\eta - k'_\eta)\, dk_\eta = 1, \tag{5.4}$$

where we used the relation $dE = \hbar v_\eta\, dk_\eta$.

Figure 5.4: A two-port quantum device and scattering parameters associated with it. The same S parameters are also used for commercial vector-network analyzers.
With the preceding normalization, we can write the wave functions on two sides of the
potential barrier in the form

$$\psi = \begin{cases} (2\pi\hbar v_1)^{-1/2} \left[a_1 \exp(i k_1 z_\eta) + b_1 \exp(-i k_1 z_\eta)\right] & \text{if } z_\eta < 0, \\ (2\pi\hbar v_2)^{-1/2} \left[b_2 \exp(i k_2 z_\eta) + a_2 \exp(-i k_2 z_\eta)\right] & \text{if } z_\eta > 0, \end{cases} \tag{5.5}$$

where $a_1$ and $a_2$ are the incident wave amplitudes and $b_1$ and $b_2$ are the scattered wave
amplitudes at ports 1 and 2, respectively. Also, we can set $k_1 = k_2$ and $v_1 = v_2$ owing to
the symmetry assumption. The scattering matrix is defined through the relation


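$$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}. \tag{5.6}$$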


Transmission and reflection coefficients of the forward wave at the potential barrier are
found by setting $a_2 = 0$ and are given by

$$t_{21} = \left.\frac{b_2}{a_1}\right|_{a_2=0}, \qquad r_{11} = \left.\frac{b_1}{a_1}\right|_{a_2=0}. \tag{5.7}$$

If the input comes from the other port, the transmission and reflection coefficients, $t_{12}$ and
$r_{22}$, are obtained by setting $a_1 = 0$. Using them, it is possible to write the scattering matrix
in the form

$$S = \begin{bmatrix} r_{11} & t_{12} \\ t_{21} & r_{22} \end{bmatrix}. \tag{5.8}$$


As any wave function satisfies the Schrödinger equation, these four coefficients can be 
found by solving this equation using the form of the wave function given in Eq. (5.5). The 
resulting scattering matrix S is unitary owing to the flux conservation (no particles are lost 
during the scattering process), 


$$S^\dagger S = S S^\dagger = I. \tag{5.9}$$


The situation changes if particles of the same energy have different velocities at the two
ports. In this case, we can still use the wave function given in Eq. (5.5) with $k_1 \neq k_2$ and
$v_1 \neq v_2$. Following the same procedure, we can calculate the matrix elements as before,
while maintaining the unitary nature of the scattering matrix.
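
As a hedged numerical illustration of the unitarity condition in Eq. (5.9), the following Python sketch builds the scattering matrix of a delta-function barrier $V(z) = \alpha\,\delta(z)$ with equal wave numbers at both ports. The delta barrier, the function name `delta_barrier_smatrix`, and the unit choice $\hbar = m = 1$ are assumptions introduced here for illustration and are not taken from the text. For this barrier, matching the wave function of Eq. (5.5) at $z = 0$ gives $t = 1/(1 + i\beta)$ and $r = -i\beta/(1 + i\beta)$ with $\beta = m\alpha/(\hbar^2 k)$.

```python
import numpy as np


def delta_barrier_smatrix(k, strength, mass=1.0, hbar=1.0):
    """Two-port S-matrix of the barrier V(z) = strength * delta(z).

    Assumes the same wave number k (and hence velocity) at both ports,
    so the flux normalization of the amplitudes is trivial.
    """
    beta = mass * strength / (hbar**2 * k)
    t = 1.0 / (1.0 + 1j * beta)          # transmission amplitude (t21 = t12)
    r = -1j * beta / (1.0 + 1j * beta)   # reflection amplitude (r11 = r22)
    # Layout follows S = [[r11, t12], [t21, r22]].
    return np.array([[r, t], [t, r]])


S = delta_barrier_smatrix(k=1.3, strength=0.8)

# Unitarity check: S^dagger S should equal the 2x2 identity matrix.
is_unitary = np.allclose(S.conj().T @ S, np.eye(2))

# Flux conservation for a wave incident at port 1: |r11|^2 + |t21|^2 = 1.
flux_sum = abs(S[0, 0]) ** 2 + abs(S[1, 0]) ** 2

print(is_unitary, flux_sum)
```

The same check can be repeated for any other barrier model, provided its amplitudes are flux-normalized before the matrix is assembled.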

It is possible to generalize the preceding formalism to the multichannel case, where 
the particles at each port can belong to different channels (also called transverse modes). 
When a port is in the form of a waveguide, the number of modes supported by that port 
depends on the geometric dimensions of the waveguide and the effective mass of charge 
carriers. In this case, channels are analogous to the electromagnetic modes of a microwave
or optical waveguide found by solving Maxwell’s equations. The only difference is that the
eigenfunctions of the Schrödinger equation provide the modes for the channels.

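If $N_1$ and $N_2$ denote the numbers of channels at the two ports, the incoming and outgoing
amplitudes can be written in a vector form using the following two column vectors:

$$\mathbf{a} = \begin{bmatrix} a_{11} \\ \vdots \\ a_{1N_1} \\ a_{21} \\ \vdots \\ a_{2N_2} \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} b_{11} \\ \vdots \\ b_{1N_1} \\ b_{21} \\ \vdots \\ b_{2N_2} \end{bmatrix}, \tag{5.10}$$

where the first subscript denotes the port and the second subscript denotes the channel
of that port. For the two-port case, the resulting scattering matrix is a square matrix with
$N_1 + N_2$ rows and columns. It is sometimes possible to partition this scattering matrix into
submatrices corresponding to reflected and transmitted wave functions, analogous to the
$2 \times 2$ scattering matrix in the single-channel case:

$$S = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \;\rightarrow\; S = \begin{bmatrix} r_{11} & t_{12} \\ t_{21} & r_{22} \end{bmatrix}. \tag{5.11}$$

Here, the transmission matrix $t_{21}$ has dimensions of $N_2 \times N_1$ (it maps the $N_1$ incoming
amplitudes at port 1 to the $N_2$ outgoing amplitudes at port 2), and the reflection matrix $r_{11}$
has dimensions of $N_1 \times N_1$.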
As the complexity of the system grows, it is useful to specify the notation clearly to
identify individual components in the scattering matrix. We employ the standard notation
used by equipment manufacturers such as Keysight™. For a two-port passive linear device,
let the subscripts $p$ and $q$ denote the ports, and $m$ and $n$ denote the channels at these ports.
The elements of the scattering matrix are then written as

$$S = [S_{(pm)(qn)}]. \tag{5.12}$$

However, when each port has a single channel, we can ignore the trivial mode index 1
and use the compact notation $S = [S_{(p1)(q1)}] = [S_{pq}]$. We always ensure that a scattering
matrix is unitary. If a nonunitary scattering matrix is encountered, we can make it unitary
with the following scaling:

$$S_{(pm)(qn)} \;\rightarrow\; \sqrt{\frac{v_{qn}}{v_{pm}}}\; S_{(pm)(qn)}, \tag{5.13}$$

where $v_{pm}$ and $v_{qn}$ are the velocities of a charge carrier in the designated channels. As this
rescaling can always be performed, we only consider unitary scattering matrices in this
book.


5.2.2 Charge Transport in Two-Port Devices 


Even though the Landauer-Büttiker method can be used for multiport devices, we consider
the simplest case seen in Figure 5.3, where two electrodes (acting as reservoirs in thermal
equilibrium) are connected to a quantum device (mesoscopic scatterer). Owing to the equilibrium
state of the two electrodes, electrons inside them behave incoherently. However,
if an electron leaves an electrode and enters the quantum device, its transport through the
device preserves its coherence by undergoing only elastic scattering events. This type of
transport is referred to as phase-coherent transport. As particle flux is conserved in this
kind of transport, the scattering matrix provides a complete description of the passage
through a quantum device. The Landauer-Büttiker method determines the current flowing
through this device using its scattering matrix, which depends on the device’s geometry.
The reservoirs (contacts) come into this description through their equilibrium distributions,
which depend on the chemical potential and temperature.

To calculate the scattering matrix, we begin with the Hamiltonian of an electron with the
effective mass $m_e$ written as

$$H_\eta = \frac{p_{z_\eta}^2}{2m_e} + \frac{p_{\perp\eta}^2}{2m_e} + V(\mathbf{r}_{\perp\eta}), \tag{5.14}$$


where the kinetic energy of the particle is separated into its longitudinal and transverse
parts, using $z_\eta$ and $\mathbf{r}_{\perp\eta}$ as the local coordinates in the two directions. The motion of the
particle along the $z_\eta$ direction is not constrained, but it is quantized in the transverse plane
owing to the confinement potential $V(\mathbf{r}_{\perp\eta})$. One can find the “transverse” channels by
solving the eigenvalue equation

$$\left[\frac{p_{\perp\eta}^2}{2m_e} + V(\mathbf{r}_{\perp\eta})\right] \psi_{\eta n}(\mathbf{r}_{\perp\eta}) = \varepsilon_{\eta n}\, \psi_{\eta n}(\mathbf{r}_{\perp\eta}). \tag{5.15}$$


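As usual, the transverse profiles of different modes are orthogonal and are normalized such
that

$$\int \psi^*_{\eta n}(\mathbf{r}_{\perp})\, \psi_{\zeta m}(\mathbf{r}_{\perp})\, d\mathbf{r}_{\perp} = \delta_{\eta\zeta}\, \delta_{mn}. \tag{5.16}$$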


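Since the particle moves freely in the $z_\eta$ direction, its energy (dispersion relation) can be
written as

$$E_{\eta n}(k_{\eta n}) = \frac{\hbar^2 k_{\eta n}^2}{2m_e} + \varepsilon_{\eta n}, \tag{5.17}$$

where $\varepsilon_{\eta n}$ is the transverse-mode eigenvalue appearing in Eq. (5.15).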
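
The dispersion relation in Eq. (5.17) implies that a channel can carry current at energy $E$ only if $E$ exceeds its transverse eigenvalue. The Python sketch below counts such open channels under an assumption that is not made in the text: the confinement is a one-dimensional hard-wall well of width $W$, for which the transverse eigenvalues are $\hbar^2\pi^2 n^2/(2 m_e W^2)$. The function names and the GaAs-like effective mass used in the example are likewise illustrative choices.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M_0 = 9.1093837015e-31   # free-electron mass (kg)
EV = 1.602176634e-19     # one electron volt (J)


def transverse_energy(n, width, m_eff):
    """Transverse eigenvalue of mode n (n = 1, 2, ...) in a hard-wall well."""
    return (HBAR * math.pi * n / width) ** 2 / (2.0 * m_eff)


def open_channels(energy, width, m_eff):
    """Number of modes whose transverse eigenvalue lies below `energy`."""
    count = 0
    while transverse_energy(count + 1, width, m_eff) < energy:
        count += 1
    return count


# Example: effective mass 0.067 m_0, electron energy 10 meV.
m_eff = 0.067 * M_0
wide = open_channels(10e-3 * EV, 100e-9, m_eff)    # 100 nm wide port
narrow = open_channels(10e-3 * EV, 20e-9, m_eff)   # 20 nm wide port
print(wide, narrow)
```

Shrinking the port width raises every transverse eigenvalue, so fewer channels remain open at a fixed energy; this is the geometric dependence of the mode count mentioned earlier.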


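The incoming and outgoing wave functions for the $n$th channel can now be expressed as

$$a_{\eta n}(z_\eta, \mathbf{r}_{\perp\eta}) = \psi_{\eta n}(\mathbf{r}_{\perp\eta}) \exp(+i k_{\eta n} z_\eta), \tag{5.18}$$
$$b_{\eta n}(z_\eta, \mathbf{r}_{\perp\eta}) = \psi_{\eta n}(\mathbf{r}_{\perp\eta}) \exp(-i k_{\eta n} z_\eta). \tag{5.19}$$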


With these incoming and outgoing wave functions, we can use the scattering-matrix
description to obtain the outgoing wave function for the $n$th channel at the port $\zeta$ in the
form

$$b_{\zeta n} = \sum_{\eta} \sum_{m \in \{\eta\}} S_{(\zeta n)(\eta m)}\, a_{\eta m}. \tag{5.20}$$

We quantize the incoming and outgoing amplitudes using the second-quantization formulation
and denote the corresponding annihilation operators by $\hat{a}_{\eta m}$ and $\hat{b}_{\eta m}$ and the
creation operators by $\hat{a}^\dagger_{\eta m}$ and $\hat{b}^\dagger_{\eta m}$, respectively. With these definitions, the operator form
of Eq. (5.20) can be written as

$$\hat{b}_{\zeta n}(E) = \sum_{\eta} \sum_{m \in \{\eta\}} S_{(\zeta n)(\eta m)}(E)\, \hat{a}_{\eta m}(E), \tag{5.21}$$
$$\hat{b}^\dagger_{\zeta n}(E) = \sum_{\eta} \sum_{m \in \{\eta\}} S^*_{(\zeta n)(\eta m)}(E)\, \hat{a}^\dagger_{\eta m}(E). \tag{5.22}$$

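Armed with these expressions, we can construct the wave function in the k-space associated
with the particles traveling through each port:

$$\hat{\Psi}_\eta = \sum_{n \in \{\eta\}} \hat{\Psi}_{\eta n}(k_{\eta n}), \tag{5.23}$$

where the wave function for the $n$th channel is given by

$$\hat{\Psi}_{\eta n}(k_{\eta n}) = \psi_{\eta n}(\mathbf{r}_{\perp\eta}) \left[\hat{a}_{\eta n}(k_{\eta n})\, e^{i k_{\eta n} z_\eta} + \hat{b}_{\eta n}(k_{\eta n})\, e^{-i k_{\eta n} z_\eta}\right]. \tag{5.24}$$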


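The definition of current density given in Eq. (5.2) demands a wave function $\hat{\Psi}(\mathbf{r}_\eta, t)$
in physical space and time. We can obtain it by taking the inverse Fourier transform of the
k-space wave function as

$$\hat{\Psi}(\mathbf{r}_\eta, t) = \sum_{n \in \{\eta\}} \int_{-\infty}^{\infty} \hat{\Psi}_{\eta n}(\mathbf{r}_\eta, k_{\eta n}) \exp\!\left(-\frac{i}{\hbar} E_{\eta n} t\right) dk_{\eta n}. \tag{5.25}$$

We substitute this result in Eq. (5.2) to obtain the current density and integrate across the
transverse plane of the quantum device to calculate the current. The result is given by

$$\hat{I}_\eta(z_\eta, t) = \frac{\hbar q_e}{2 i m_e} \left[ \int \hat{\Psi}^\dagger(\mathbf{r}_\eta, t)\, \frac{\partial}{\partial z_\eta} \hat{\Psi}(\mathbf{r}_\eta, t)\, d\mathbf{r}_{\perp\eta} - \int \left(\frac{\partial}{\partial z_\eta} \hat{\Psi}^\dagger(\mathbf{r}_\eta, t)\right) \hat{\Psi}(\mathbf{r}_\eta, t)\, d\mathbf{r}_{\perp\eta} \right]. \tag{5.26}$$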


In practice, the complexity of the problem can be significantly reduced by changing
the integration from the k space to the energy domain while taking the inverse Fourier
transform in Eq. (5.25). This can be done by using the dispersion relation in Eq. (5.17) to
get $dE_{\eta n} = (\hbar^2 k_{\eta n}/m_e)\, dk_{\eta n} = \hbar v_{\eta n}\, dk_{\eta n}$. However, we also need to map the annihilation
and creation operators to their energy-space representation. We recall from Chapter 2 that
we need to ensure the invariance of the commutator or anticommutator relations, regardless
of the representation used. We are dealing with electrons here, which are fermions. Thus,
we need to ensure that the anticommutator relation is invariant in both representations. In
the k space, this relation has the form

$$\hat{a}_{\eta m}(k_{\eta m})\, \hat{a}^\dagger_{\zeta n}(k_{\zeta n}) + \hat{a}^\dagger_{\zeta n}(k_{\zeta n})\, \hat{a}_{\eta m}(k_{\eta m}) = \delta_{\eta\zeta}\, \delta_{mn}\, \delta(k_{\eta m} - k_{\zeta n}). \tag{5.27}$$

In the energy domain, this relation is preserved if we use

$$\hat{a}_{\eta m}(E_{\eta m})\, \hat{a}^\dagger_{\zeta n}(E_{\zeta n}) + \hat{a}^\dagger_{\zeta n}(E_{\zeta n})\, \hat{a}_{\eta m}(E_{\eta m}) = \frac{\delta_{\eta\zeta}\, \delta_{mn}}{h v_{\eta m}(E_{\eta m})}\, \delta(E_{\eta m} - E_{\zeta n}). \tag{5.28}$$

The preceding equation shows that we can transform all creation and annihilation
operators from the k space to the energy domain with a simple mapping of the form

$$\hat{\Upsilon}_{\eta n}(k_{\eta n}) \;\rightarrow\; \frac{1}{\sqrt{h v_{\eta n}(E_{\eta n})}}\, \hat{\Upsilon}_{\eta n}(E_{\eta n}), \quad \text{where } \hat{\Upsilon} \in \{\hat{a}, \hat{a}^\dagger, \hat{b}, \hat{b}^\dagger\}. \tag{5.29}$$

Even though this transformation looks complex because the prefactor depends on the
energy, it turns out that its use cancels out certain terms in the Fourier integral, simplifying
it considerably. Using the preceding relations, we can write the wave function in Eq.
(5.24) in the energy basis as

$$\hat{\Psi}_{\eta}(\mathbf{r}_\eta, E_{\eta n}) = \frac{\psi_{\eta n}(\mathbf{r}_{\perp\eta})}{\sqrt{h v_{\eta n}}} \left[\hat{a}_{\eta n}(E_{\eta n}) \exp(i k_{\eta n} z_\eta) + \hat{b}_{\eta n}(E_{\eta n}) \exp(-i k_{\eta n} z_\eta)\right]. \tag{5.30}$$


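We can now calculate the current given in Eq. (5.26). In view of the orthogonal nature
of different transverse modes, when we multiply and integrate across the transverse area
of the quantum device, only terms with the matching port and channel indices survive in
the final expression. Consider the first integral in Eq. (5.26) with the wave function given
in Eq. (5.25). Taking the $z$ derivative and using the mode-orthogonality relation, we obtain

$$\int \hat{\Psi}^\dagger(\mathbf{r}_\eta, t)\, \frac{\partial}{\partial z_\eta} \hat{\Psi}(\mathbf{r}_\eta, t)\, d\mathbf{r}_{\perp\eta} = \frac{1}{h} \sum_{n \in \{\eta\}} \iint \frac{i k_{\eta n}(E'_{\eta n})\, K_{\eta n 1}(E_{\eta n})\, K_{\eta n 2}(E'_{\eta n})}{\sqrt{v_{\eta n}(E_{\eta n})\, v_{\eta n}(E'_{\eta n})}} \exp\!\left[-\frac{i}{\hbar}(E'_{\eta n} - E_{\eta n})\, t\right] dE_{\eta n}\, dE'_{\eta n}, \tag{5.31}$$

where we have defined the following two relations:

$$K_{\eta n 1}(E_{\eta n}) = \exp[-i k_{\eta n}(E_{\eta n}) z_\eta]\, \hat{a}^\dagger_{\eta n}(E_{\eta n}) + \exp[i k_{\eta n}(E_{\eta n}) z_\eta]\, \hat{b}^\dagger_{\eta n}(E_{\eta n}), \tag{5.32}$$
$$K_{\eta n 2}(E'_{\eta n}) = \exp[i k_{\eta n}(E'_{\eta n}) z_\eta]\, \hat{a}_{\eta n}(E'_{\eta n}) - \exp[-i k_{\eta n}(E'_{\eta n}) z_\eta]\, \hat{b}_{\eta n}(E'_{\eta n}). \tag{5.33}$$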


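We can simplify further by noting that speeds of electrons of different energies are not
that different because speed is a slowly varying function of energy. If we assume that they
are all the same in the energy range of interest, we can use $v_{\eta n}(E_{\eta n}) \approx v_{\eta n}(E'_{\eta n}) = \hbar k_{\eta n}/m_e$.
Using this approximation in Eq. (5.31), we obtain

$$\int \hat{\Psi}^\dagger(\mathbf{r}_\eta, t)\, \frac{\partial}{\partial z_\eta} \hat{\Psi}(\mathbf{r}_\eta, t)\, d\mathbf{r}_{\perp\eta} = \frac{i m_e}{h \hbar} \sum_{n \in \{\eta\}} \iint K_{\eta n 1}(E_{\eta n})\, K_{\eta n 2}(E'_{\eta n}) \exp\!\left[-\frac{i}{\hbar}(E'_{\eta n} - E_{\eta n})\, t\right] dE_{\eta n}\, dE'_{\eta n}. \tag{5.34}$$

Using the same procedure, the second term of Eq. (5.26) is found to be

$$\int \left(\frac{\partial}{\partial z_\eta} \hat{\Psi}^\dagger(\mathbf{r}_\eta, t)\right) \hat{\Psi}(\mathbf{r}_\eta, t)\, d\mathbf{r}_{\perp\eta} = \frac{i m_e}{h \hbar} \sum_{n \in \{\eta\}} \iint K_{\eta n 2}(E_{\eta n})\, K_{\eta n 1}(E'_{\eta n}) \exp\!\left[-\frac{i}{\hbar}(E'_{\eta n} - E_{\eta n})\, t\right] dE_{\eta n}\, dE'_{\eta n}. \tag{5.35}$$

Using these results in Eq. (5.26), we finally obtain the current in the form

$$\hat{I}_\eta(t) = \frac{q_e}{2h} \iint \sum_{n \in \{\eta\}} \left[K_{\eta n 1}(E_{\eta n})\, K_{\eta n 2}(E'_{\eta n}) - K_{\eta n 2}(E_{\eta n})\, K_{\eta n 1}(E'_{\eta n})\right] \exp\!\left[-\frac{i}{\hbar}(E'_{\eta n} - E_{\eta n})\, t\right] dE_{\eta n}\, dE'_{\eta n}. \tag{5.36}$$

This result can be simplified by invoking the anticommutator relations $[\hat{a}_{\eta n}, \hat{a}^\dagger_{\eta n}]_+ = 1$
and $[\hat{b}_{\eta n}, \hat{b}^\dagger_{\eta n}]_+ = 1$. It is easy to show that

$$K_{\eta n 1}(E_{\eta n})\, K_{\eta n 2}(E'_{\eta n}) - K_{\eta n 2}(E_{\eta n})\, K_{\eta n 1}(E'_{\eta n}) \approx 2\hat{a}^\dagger_{\eta n}(E_{\eta n})\, \hat{a}_{\eta n}(E'_{\eta n}) - 2\hat{b}^\dagger_{\eta n}(E_{\eta n})\, \hat{b}_{\eta n}(E'_{\eta n}). \tag{5.37}$$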


A final adjustment is required to account for the electron’s spin that has been ignored so
far. As per Pauli’s exclusion principle, two electrons of opposite spins can be transported
through each channel without any interaction. This is accounted for by multiplying the
current in Eq. (5.36) by a factor of two. With this modification, the current through the port
$\eta$ is given by

$$\hat{I}_\eta(t) = \frac{2 q_e}{h} \iint \sum_{n \in \{\eta\}} \left[\hat{a}^\dagger_{\eta n}(E_{\eta n})\, \hat{a}_{\eta n}(E'_{\eta n}) - \hat{b}^\dagger_{\eta n}(E_{\eta n})\, \hat{b}_{\eta n}(E'_{\eta n})\right] \exp\!\left[-\frac{i}{\hbar}(E'_{\eta n} - E_{\eta n})\, t\right] dE_{\eta n}\, dE'_{\eta n}. \tag{5.38}$$


5.2.3 Average Current through the Quantum Device 


The preceding section has provided us with a current operator in terms of the creation and
annihilation operators associated with electrons of different energies in different channels
and ports of the quantum device. In this section we use this operator to calculate the average
current passing through the device. The first thing to note is that the $\hat{b}_{\eta n}$ operators are
related to the $\hat{a}_{\eta n}$ operators through the scattering matrix, as indicated in Eq. (5.21). We
can write this relation as $\hat{\mathbf{b}}(E) = S(E)\, \hat{\mathbf{a}}(E)$ for any energy $E$. Using it, we can replace the
operator product $\hat{b}^\dagger_{\eta n}(E_{\eta n})\, \hat{b}_{\eta n}(E'_{\eta n})$ in Eq. (5.38) with an expression involving the elements
of the scattering matrix and the operator product $\hat{a}^\dagger_{\eta n}(E_{\eta n})\, \hat{a}_{\eta n}(E'_{\eta n})$. Thus, the current operator
$\hat{I}_\eta(t)$ depends only on the operator $\hat{a}^\dagger_{\eta n}(E_{\eta n})\, \hat{a}_{\eta n}(E'_{\eta n})$, which is related to the number
density of incoming electrons.

In most cases, we are interested in the ensemble-averaged value $\langle \hat{I}_\eta(t) \rangle$ of the current
operator because that is a measurable quantity. Given that this average is a function of
the ensemble-averaged number density of electrons coming from a reservoir in thermal
equilibrium, we can use the results of Section 3.1 to obtain

$$\langle \hat{a}^\dagger_{\eta n}(E_{\eta n})\, \hat{a}_{\zeta m}(E'_{\zeta m}) \rangle = \delta_{\eta\zeta}\, \delta_{nm}\, \delta(E_{\eta n} - E'_{\zeta m})\, f_{F\eta}(E_{\eta n}), \tag{5.39}$$

where $f_{F\eta}(E_{\eta n})$ is the Fermi distribution for the port $\eta$. When an electrode is in thermal
equilibrium at a temperature $T_\eta$ with a chemical potential $\mu_\eta$ and an electrical potential
$V_\eta$, its Fermi distribution is given by

$$f_{F\eta}(E_{\eta n} - q_e V_\eta) = \left[\exp\!\left(\frac{E_{\eta n} - \mu_\eta - q_e V_\eta}{k_B T_\eta}\right) + 1\right]^{-1}. \tag{5.40}$$
n 


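If we know the density operator $\hat{\rho}$ of a quantum system, the average current can be
calculated using $\langle \hat{I}_\eta(t) \rangle = \mathrm{Tr}(\hat{\rho}\, \hat{I}_\eta(t))$. Using Eqs. (5.38) and (5.39) and integrating over
$E'_{\eta n}$, we obtain

$$\langle \hat{I}_\eta(t) \rangle = \frac{2 q_e}{h} \int \sum_{n \in \{\eta\}} T_{\zeta\eta}(E_{\eta n})\, f_{F\eta}(E_{\eta n} - q_e V_\eta)\, dE_{\eta n}, \tag{5.41}$$

where $T_{\zeta\eta}$ is the transmission probability (related to the elements of the scattering matrix)
for an electron to go from the port $\eta$ to the port $\zeta$. Since some electrons from the port $\zeta$
may also appear at the port $\eta$ through scattering, the net current between these two ports is
given by

$$\langle \hat{I}_\eta(t) \rangle = \frac{2 q_e}{h} \int \sum_{n \in \{\eta\}} T_{\zeta\eta}(E_{\eta n}) \left[f_{F\eta}(E_{\eta n} - q_e V_\eta) - f_{F\zeta}(E_{\eta n} - q_e V_\zeta)\right] dE_{\eta n}. \tag{5.42}$$

This equation is known as the Tsu-Esaki equation and is useful in several different
contexts.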

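The final energy integral can be carried out under some restrictive assumptions. First, we
write the transmission coefficient as $\sum_n T_{\zeta\eta}(E_{\eta n}) = M_{\zeta\eta}(E_{\eta n})\, \bar{T}_{\zeta\eta}(E_{\eta n})$, where $\bar{T}_{\zeta\eta}(E_{\eta n})$ is
the average probability that an electron with energy $E_{\eta n}$ at the port $\eta$ will be transmitted to
the port $\zeta$ and $M_{\zeta\eta}$ is the total number of channels available to this electron. By using this
form in Eq. (5.42) and assuming that the average transmission coefficient is independent
of energy and applied voltages, we obtain

$$\langle \hat{I}_\eta(t) \rangle = \frac{2 q_e}{h} M_{\zeta\eta}\, \bar{T}_{\zeta\eta} \int \left[f_{F\eta}(E_{\eta n} - q_e V_\eta) - f_{F\zeta}(E_{\eta n} - q_e V_\zeta)\right] dE_{\eta n}. \tag{5.43}$$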


The Fermi factors appearing in Eq. (5.43) can be simplified by noting that the local
chemical potential in the presence of a static electrostatic potential is given by $\mu_\eta - q_e V_\eta$.
Therefore, if the voltage difference between the two electrodes is given by $V_{\eta\zeta} = V_\eta - V_\zeta$,
we can use the approximation

$$\int \left[f_{F\eta}(E_{\eta n} - q_e V_\eta) - f_{F\zeta}(E_{\eta n} - q_e V_\zeta)\right] dE_{\eta n} \approx q_e (V_\eta - V_\zeta) = q_e V_{\eta\zeta}. \tag{5.44}$$


Under these assumptions, we obtain the following simple expression for the average
current flowing from port $\eta$ to $\zeta$:

$$\langle \hat{I}_\eta \rangle \approx \frac{2 q_e^2}{h} M_{\zeta\eta}\, \bar{T}_{\zeta\eta}\, V_{\eta\zeta} = G_{\zeta\eta} V_{\eta\zeta}, \tag{5.45}$$

where the conductance of the quantum device is defined using the well-known current-voltage
relation $I = GV$ and is given by

$$G_{\zeta\eta} = \frac{2 q_e^2}{h} M_{\zeta\eta}\, \bar{T}_{\zeta\eta}. \tag{5.46}$$

This is the main result of the Landauer-Büttiker method for two-terminal quantum
devices. It is known to hold at temperatures close to absolute zero. It can be
shown that it holds approximately at high temperatures as well [302]. To get this simple
result, we had to assume that the transmission coefficient is independent of energy and
applied voltages. This assumption is quite restrictive and its validity ultimately determines
the validity of the preceding simple expression [302]. Even though Eq. (5.45) is derived
for a two-terminal device, it can be readily extended to a multiport device using

$$\langle \hat{I}_\eta \rangle = \sum_{\zeta \neq \eta} G_{\zeta\eta} (V_\eta - V_\zeta), \tag{5.47}$$

where we used the relation $V_{\eta\zeta} = V_\eta - V_\zeta$ for the voltage difference between any two
ports.
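
The step from the energy integral in Eq. (5.42) to the linear relation in Eq. (5.45) can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: both electrodes share one temperature and one chemical potential, the electrode chemical potential is shifted by $q_e V$ following the sign convention of Eq. (5.40), the transmission is supplied by the caller as a function of energy, and the names `fermi`, `landauer_current`, and `conductance` are hypothetical.

```python
import numpy as np

Q_E = 1.602176634e-19   # elementary charge (C)
H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)


def fermi(energy, mu, temperature):
    """Fermi distribution, written with tanh to avoid overflow in exp."""
    x = (energy - mu) / (K_B * temperature)
    return 0.5 * (1.0 - np.tanh(0.5 * x))


def conductance(channels, transmission):
    """Conductance (2 q_e^2 / h) M T for constant average transmission."""
    return (2.0 * Q_E**2 / H) * channels * transmission


def landauer_current(transmission, mu, v_eta, v_zeta, temperature, n_points=20001):
    """Integrate (2 q_e / h) * T(E) * [f_eta(E) - f_zeta(E)] over energy.

    `transmission` must return the summed transmission M * T at each energy.
    """
    mu_eta = mu + Q_E * v_eta
    mu_zeta = mu + Q_E * v_zeta
    span = 40.0 * K_B * temperature  # Fermi tails are negligible beyond this
    energy = np.linspace(min(mu_eta, mu_zeta) - span,
                         max(mu_eta, mu_zeta) + span, n_points)
    window = fermi(energy, mu_eta, temperature) - fermi(energy, mu_zeta, temperature)
    integrand = transmission(energy) * window
    # Trapezoidal rule written out explicitly.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(energy))
    return (2.0 * Q_E / H) * integral


# One channel with energy-independent transmission 0.5 and a 1 mV bias at 4 K.
t_avg, bias = 0.5, 1.0e-3
current = landauer_current(lambda e: t_avg * np.ones_like(e),
                           mu=0.1 * Q_E, v_eta=bias, v_zeta=0.0, temperature=4.0)
print(current, conductance(1, t_avg) * bias)
```

With an energy-independent transmission the two printed numbers agree. Passing an energy-dependent `transmission` shows how far the linear expression can drift once that assumption fails.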


5.2.4 Construction of Scattering Matrix 


The first step in applying the Landauer-Büttiker method is to identify the scattering states
for all ports of a quantum device and build the scattering matrix of the device. In practice,
several probing techniques are used to construct the scattering matrix through experimental
measurements. These techniques rely on the Landauer-Büttiker method because Kirchhoff’s
laws are inadequate for the probing purpose. The reason is that contact resistances
cannot be ignored for the conductance because of its quantum nature (see Aside 5.2). The
most striking result is that we cannot arbitrarily increase the current through a quantum
device by changing its dimensions and other properties. This is because, for a given voltage,
there is a limit to the maximum current a single scattering event can allow. This limit
corresponds to the conductance quantum $G_Q$ given in Eq. (5.3).


Aside 5.2 Contact Resistance 


When a nanosize metallic lead (called here a quantum lead) is placed between two wider 
(classical) electrodes, its resistance measured through these electrodes is not zero, even 
when the charge transport is fully ballistic (no electrons are scattered by the quantum lead). 
This finite resistance occurs at the interface between the electrode and the quantum lead 
and its origin lies in the different numbers of conduction channels within a broad electrode 
and a narrow quantum lead. 

Consider a quantum lead connected to two broad electrodes, labeled $\eta$ and $\zeta$ in Figure 5.5.
Let us focus on two points $\eta'$ and $\zeta'$ inside the quantum lead but adjacent to the corresponding
electrode. The relation given in Eq. (5.46) provides the conductance between $\eta$ and $\zeta$
because we used the electrochemical potential to track electrons. As the electrochemical
potential is the total potential and includes both the chemical potential and the electric
potential, it can be used to calculate the distribution of electrons in thermal equilibrium, a
condition satisfied easily for a wide (classical) electrode.

How do we find the conductance between the points $\eta'$ and $\zeta'$ inside the quantum lead
that is not expected to be in thermal equilibrium? It turns out that we can extend the same
reasoning to the quantum lead under steady-state conditions because none of the parameters
change with time. The chemical potential has a well-defined quasi-equilibrium value
in the steady state on both sides of the scattering barrier, assumed to be located in the middle
of the quantum lead. However, just after electrons are transmitted through this barrier
(before energy relaxation has taken place) the distribution is severely distorted from its
equilibrium Fermi distribution. As a result, strictly speaking, chemical potential is not a
well-defined quantity inside the quantum lead. We can define a pseudochemical potential
such that when we integrate the corresponding Fermi distribution over energy, we obtain
the correct number of electrons. However, this pseudochemical potential cannot be used to
calculate the electron distribution within the quantum lead (even though it provides the correct
number of electrons) because electrons are “hot” and their distribution deviates from
the equilibrium Fermi distribution.

Figure 5.5: Schematic illustration of the contact resistance. The outer black circles ($\eta$ and $\zeta$) are inside the electrodes, whereas
the inner circles ($\eta'$ and $\zeta'$) are within the quantum device but close to their corresponding electrodes.

Our goal is to use this pseudochemical potential to establish the potential difference
between $\eta'$ and $\zeta'$ so that we can calculate the conductance between those points. Consider
electrons moving along the path $\eta \rightarrow \eta' \rightarrow \zeta' \rightarrow \zeta$. Even though $\mu_{\eta'} = \mu_\eta$, we
expect $\mu_{\zeta'} \neq \mu_\zeta$ because the number of electrons on this side of the potential barrier is
smaller as some of them have been scattered back. The pseudochemical potential in this
situation can be written as

$$\mu_{\zeta'} = \bar{T}_{\zeta\eta}\, \mu_\eta + (1 - \bar{T}_{\zeta\eta})\, \mu_\zeta, \tag{5.48}$$

where $1 - \bar{T}_{\zeta\eta}$ represents the fraction of electrons scattered back. Therefore, the potential
difference $V_{\eta'\zeta'}$ seen by the electrons between the internal points $\eta'$ and $\zeta'$ is given by

$$q_e V_{\eta'\zeta'} = \mu_{\eta'} - \mu_{\zeta'} = (1 - \bar{T}_{\zeta\eta})(\mu_\eta - \mu_\zeta) = (1 - \bar{T}_{\zeta\eta})\, q_e V_{\eta\zeta}. \tag{5.49}$$

We get the same result for the potential difference if we track the electrons from the contact
$\zeta$ to $\eta$. As the current flowing between the internal points is equal to the current flowing
between the two wide contacts, we can use this result to calculate the conductance $G_{\eta'\zeta'}$
between the internal points $\eta'$ and $\zeta'$:

$$G_{\eta'\zeta'} = \frac{\langle \hat{I}_\eta \rangle}{V_{\eta'\zeta'}} = \frac{2 q_e^2}{h}\, \frac{M_{\zeta\eta}\, \bar{T}_{\zeta\eta}}{1 - \bar{T}_{\zeta\eta}}. \tag{5.50}$$

The contact resistance $R_{\eta\eta'}$ can now be calculated as

$$R_{\eta\eta'} = \frac{1}{G_{\zeta\eta}} - \frac{1}{G_{\eta'\zeta'}} = \frac{h}{2 q_e^2 M_{\zeta\eta}}. \tag{5.51}$$

This result shows clearly that the contact resistance is negligible when the number of available
conduction channels is very large. This is the case for all classical (wide) leads for
which $M_{\zeta\eta} \rightarrow \infty$. However, a quantum lead only provides a limited number of conduction
channels owing to its small transverse dimensions. This feature requires a redistribution
of incoming electrons among a limited number of current-carrying modes at the interface,
leading to a finite contact resistance (see also Ref. [303]).
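
To attach numbers to the contact resistance, the short Python sketch below evaluates the electrode-to-electrode conductance $(2q_e^2/h)MT$ of Eq. (5.46) and the internal conductance $(2q_e^2/h)MT/(1-T)$ of Eq. (5.50), and takes the difference of the corresponding resistances. Interpreting the contact resistance as this difference, as well as the function names used, are assumptions made for this illustration.

```python
Q_E = 1.602176634e-19   # elementary charge (C)
H = 6.62607015e-34      # Planck constant (J s)
G_UNIT = 2.0 * Q_E**2 / H   # 2 q_e^2 / h in siemens


def electrode_conductance(channels, transmission):
    """Conductance measured between the wide electrodes, (2 q_e^2/h) M T."""
    return G_UNIT * channels * transmission


def internal_conductance(channels, transmission):
    """Conductance between the internal points, (2 q_e^2/h) M T / (1 - T)."""
    return G_UNIT * channels * transmission / (1.0 - transmission)


def contact_resistance(channels, transmission):
    """Electrode-to-electrode resistance minus the internal resistance."""
    return (1.0 / electrode_conductance(channels, transmission)
            - 1.0 / internal_conductance(channels, transmission))


# The result depends on the channel count but not on the transmission.
for m in (1, 10, 1000):
    print(m, contact_resistance(m, 0.3), contact_resistance(m, 0.9))
```

For a single channel the value is $h/(2q_e^2)$, roughly 12.9 kΩ, and it falls inversely with the number of channels, consistent with the vanishing contact resistance of wide leads.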


The four-port sensing method is a highly accurate probing method used in electrical 
engineering for measuring impedances. This method is also known as the Kelvin sensing 
method after its inventor, and the four probes are known as Kelvin probes. The method 
relies on using current and voltage probes. The current probes are assumed to have zero 
impedance (thus no scattering). The voltage probes when connected to the measuring 
sample have no current flowing through them. Engquist and Anderson applied such a measurement
technique in 1981 to the Landauer-Büttiker method [304]. They found that, in
principle, the conductance given in Eq. (5.46) can be measured if ideal voltage probes are 
used and conductance is calculated by taking the ratio of the current through the device to 
the voltage difference measured. 

Consider the four-probe configuration shown in Figure 5.6. For simplicity, we restrict
our analysis to each lead having a single channel. The procedure will be similar in the
multichannel case except for algebraic complexity. The current flows from port 1 to port 2.
The voltage probes 3 and 4 are weakly coupled to the sample; weak coupling can be realized
in practice by using tunnel junctions as the interface. As no current flows through the
voltage probes ($I_3 = I_4 = 0$), we have $I_1 = -I_2 \approx (2q_e^2/h)\, T\, (V_1 - V_2)$, as required by
charge conservation. As there is no current flow between ports 3 and 4, we could assume
that the associated impedance is very high, enabling us to ignore the effects of $G_{34}$ and $G_{43}$.
With these simplifications, we obtain the result [305]

$$V_3 - V_4 = \frac{T_{31} T_{42} - T_{32} T_{41}}{(T_{31} + T_{32})(T_{41} + T_{42})}\, (V_1 - V_2), \tag{5.52}$$

where we used the transmission coefficients instead of conductances.

Our task is to measure the conductance between ports 3 and 4. This can be done by
accounting for all scattering events between these two ports through the transmission and
reflection coefficients such that $T + R = 1$. If the two voltage probes are identical and placed
symmetrically as seen in Figure 5.6, we can assume that the transmission from port 1 to
port 3 is the same as from port 4 to port 2. These transmissions include two contributions:
direct tunneling with some small probability $\epsilon \ll 1$ (because electrodes 3 and 4 are weakly
coupled) and tunneling after reflection from the scattering region with the probability $\epsilon R$.
The application of the superposition principle leads to $T_{31} = T_{42} = \epsilon(1 + R)$. On the other
hand, the probability of going from the first port to the fourth one (and from port 3 to port
2) is $T_{32} = T_{41} = \epsilon T$. Using these results in Eq. (5.52), we obtain $V_3 - V_4 = R(V_1 - V_2)$.
Therefore the conductance between ports 3 and 4 is given by

$$G = \frac{I}{V_3 - V_4} = \frac{2 q_e^2}{h}\, \frac{T}{R}. \tag{5.53}$$

Figure 5.6: The four-port sensing setup. Current flows from port 1 to port 2. The two voltage probes 3 and 4 are weakly coupled
to the sample such that no current flows through them.


This is identical to the expression in Eq. (5.50), confirming the validity of this method. 
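
The algebra leading from Eq. (5.52) to $V_3 - V_4 = R(V_1 - V_2)$ is short enough to verify numerically. The sketch below plugs the symmetric weak-coupling values $T_{31} = T_{42} = \epsilon(1+R)$ and $T_{32} = T_{41} = \epsilon T$ into the ratio of Eq. (5.52); the function names and the sample values of $R$ and $\epsilon$ are illustrative assumptions.

```python
def probe_voltage_ratio(t31, t32, t41, t42):
    """Ratio (V3 - V4) / (V1 - V2) from the four-probe relation, Eq. (5.52)."""
    return (t31 * t42 - t32 * t41) / ((t31 + t32) * (t41 + t42))


def symmetric_probe_ratio(reflection, coupling):
    """Same ratio for identical, symmetrically placed, weakly coupled probes."""
    transmission = 1.0 - reflection           # T + R = 1
    t31 = t42 = coupling * (1.0 + reflection)  # direct plus reflected path
    t32 = t41 = coupling * transmission        # path through the scatterer
    return probe_voltage_ratio(t31, t32, t41, t42)


# The coupling strength cancels, leaving the reflection coefficient itself.
for eps in (1e-2, 1e-4):
    print(eps, symmetric_probe_ratio(0.3, eps))
```

Because the numerator reduces to $4\epsilon^2 R$ and the denominator to $4\epsilon^2$, the coupling drops out of the ratio, which is why the probes can be made arbitrarily weak without changing the measured conductance.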

The main limitation of the Landauer-Büttiker method is that it breaks down as soon 
as charge carriers start to interact with each other or other parts of a quantum device. 
In many nanoscale devices, interactions of an electron with other electrons or phonons 
cannot be neglected. In this situation, the Landauer-Büttiker method may not describe 
the underlying physics even qualitatively. Over the last three decades, a method based on 
the nonequilibrium Green’s function has emerged as the method of choice for analyzing 
current flow through quantum devices. As discussed in the following section, the versatility 
of this method stems from its ability to handle all kinds of interactions that charge particles 
and transport channels may endure when carrying current. 


5.3 Nonequilibrium Green’s Function Method 


The nonequilibrium Green’s function (NGF) method has been a workhorse of computa- 
tional many-body studies for several decades. However, its widespread use does not imply 
that the underlying concepts and their implementations are easy or intuitive. Indeed, one 
has to spend a considerable amount of time mastering the theory and then make reasonable 
approximations to account for many interactions endured by the charge carriers during 
their transport. Being a generic technique, it has been applied to study the dynamics of 
plasmas, electrons, spins, and phonons in various materials. It has been found that the NGF 
method can provide an accurate description of a variety of nonequilibrium charge-transport 
scenarios under reasonable approximations. 

Historically, Martin and Schwinger described many-body effects as early as 1959 using 
the NGF method from a unified nonperturbative point of view [306]. Further developments 
in this area were due to Kadanoff and Baym [307]. Later, this method was applied to super- 
conductors and to molecular electronics. Keldysh used a graphical technique, analogous to 
the Feynman diagrams in field theory, to develop Green’s functions for particles in a sta- 
tistical system that were subjected to an external field [308]. Even though they appeared 
quite different, the equivalence between the technique of Kadanoff and Baym and that of 
Keldysh was demonstrated in 1975 by Langreth [309]. The graphical technique of Keldysh 
was advanced further in Refs. [310, 311, 312] using the NGF method. Meir and Wingreen
showed in 1992 that the Keldysh formalism can be used to derive the Landauer formula 
for the current passing through a region of interacting electrons [313]. The NGF method 
was also shown to be a versatile tool for calculating the response of a biased, double- 
barrier, quantum well to a small alternating voltage [314]. Jauho et al. used this method in 
1994 to calculate the time-dependent current through a mesoscopic region coupled to two 
leads under the influence of external time-dependent voltages [315]. These studies fueled 
the use of the NGF method for describing charge transport while incorporating fully the 
interactions of electrons with photons, phonons, or other electrons. 

In spite of the preceding developments, the NGF method remained inaccessible to the
engineering community owing to its complexity and its mathematical nature. Datta bridged 
this gap in 1989 with his work on a quantum kinetic equation [316, 317], and he made 
further advances in later years [318, 319, 320]. Many resources on the NGF-based transport 
modeling can be found at the website known as nanoHub [321]. Interested readers may also 
find Refs. [8, 9, 167, 302, 305, 322] useful for understanding the NGF method. 


5.3.1 Evolution of Quantum Operators 


The use of the NGF method for calculating currents in nanoscale devices can be illustrated 
by considering a simple scenario shown in Figure 5.7. The quantum device in the middle 
is connected to an electrode on each side through a coupling region. The two electrodes 
are assumed to be in thermal equilibrium with no direct coupling between them and have 
properties identical to those used for the Landauer-Büttiker method (see Fig. 5.2). The
electrochemical potentials of the electrodes can be shifted by an external voltage to enable 
the flow of electrons from one electrode to the other via the quantum device. Electrons do 
not interact with each other inside the electrodes, but they do so inside the quantum device. 
To calculate the current, we treat the electrodes and the quantum device as separate entities 
before current begins to flow [323]. More precisely, the two electrodes are separated from 
each other for t < 0 and transport charges through the device starting at t = 0. As we see 
later, even with these simplifications, the model becomes challenging computationally if 
we include various interactions among electrons and phonons that lead to phenomena such 
as Coulomb blockade and the Kondo effect. 


Figure 5.7 A quantum device coupled to two electrodes via quantum operators. The electrodes do not
couple with each other directly.

To understand the advantages of the NGF method over a direct solution of the
Schrödinger equation, consider the steps involved in constructing the solution for the
system described in Figure 5.7. If the Hamiltonian of the whole system is given by $H$, one
can describe its evolution by solving the Schrödinger equation, but that is not an easy
task in practice as the Hamiltonian $H$ contains multiple parts representing the electrodes,
the coupling regions, and the quantum device. The Green’s function $G(E)$ provides an
alternative approach to solving the same problem. It satisfies the following equation in the
energy basis:

$$(E - H)\,G(E) = I, \tag{5.54}$$

where $I$ is the identity matrix. The Green’s function enables us to write the response of a
quantum system to a constant perturbation $|\delta\Psi\rangle$ as

$$(E - H)\,|\Psi\rangle = -|\delta\Psi\rangle \;\Rightarrow\; |\Psi\rangle = -G(E)\,|\delta\Psi\rangle. \tag{5.55}$$

The second equation is much easier to solve compared to the perturbed Schrödinger
equation. For technical reasons, Green’s function requires the use of a contour in the complex
time plane known as the Keldysh contour. We discuss this aspect before describing the
current flow through a quantum device.
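
As a minimal numerical sketch of Eqs. (5.54) and (5.55), one can build $G(E)$ for a small matrix Hamiltonian by direct inversion. The 3 × 3 matrix `H`, the probe energy `E`, and the small imaginary shift `eta` (added only to keep the inverse finite near an eigenvalue) are illustrative assumptions and are not taken from the text.

```python
import numpy as np

# Hypothetical 3-level Hamiltonian in arbitrary energy units (assumption).
H = np.array([[0.0, 0.2, 0.0],
              [0.2, 0.5, 0.1],
              [0.0, 0.1, 1.0]])
identity = np.eye(3)

def green(E, eta=1e-9):
    """Return G(E) = [(E + i*eta) I - H]^(-1); eta regularizes the inverse."""
    return np.linalg.inv((E + 1j * eta) * identity - H)

E = 0.3                      # probe energy chosen away from the eigenvalues of H
G = green(E)

# Eq. (5.54): (E - H) G(E) = I, up to terms of order eta
residual_554 = np.abs((E * identity - H) @ G - identity).max()

# Eq. (5.55): the response to a perturbation |dPsi> is |Psi> = -G(E)|dPsi>
d_psi = np.array([1.0, 0.0, 0.0])
psi = -G @ d_psi
residual_555 = np.abs((E * identity - H) @ psi + d_psi).max()

print(residual_554, residual_555)
```

Both residuals should be of the order of `eta`, which illustrates that the response follows from a single matrix inversion once $G(E)$ is known.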

In Section 2.2.5, we introduced the time-evolution operator $U(t, t_0)$ that describes the
evolution of a quantum system from an initial time $t_0$ to a later time $t$ (see Aside 2.14). As
discussed there, when the system’s Hamiltonian is independent of time, the time-evolution
operator has the simple form $U(t, t_0) = \exp[-(i/\hbar)(t - t_0)H]$. When the Hamiltonian
depends on time, the unitary operator evolves in a more complicated manner as

$$U(t, t_0) = \mathcal{T}\left\{\exp\left(-\frac{i}{\hbar}\int_{t_0}^{t} H(t')\,dt'\right)\right\}, \tag{5.56}$$

where $\mathcal{T}$ is a time-ordering operator; it rearranges the operators in chronological order such
that an operator with a later time is placed to the left of operators with earlier times.
However, this ordering depends on the nature of the particle (fermion or boson) whose dynamics
is being studied. More specifically, the time ordering of the product of two operators, $A(t_1)$
and $B(t_2)$, is written as

$$\mathcal{T}\{A(t_1)B(t_2)\} = u(t_1 - t_2)\,A(t_1)B(t_2) \pm u(t_2 - t_1)\,B(t_2)A(t_1), \tag{5.57}$$

where $u(t)$ is the Heaviside step function and the positive or negative sign is chosen for
bosons and fermions, respectively. The time-evolution operator satisfies the group property

$$U(t_2, t')\,U(t', t_1) = U(t_2, t_1), \tag{5.58}$$

where $t'$ is an intermediate time such that $t_1 < t' < t_2$. This is an important property, and it
enables us to construct the Keldysh contour by choosing $t_1$ to be a complex number.

We allow time to take complex values and choose $t_1 = t_0 - it_i$ and $t_2 = t_0$, where
$t_i = \hbar/(k_BT)$ depends inversely on the thermal energy $k_BT$. The evolution operator for a
closed system with the time-independent Hamiltonian can then be written as

$$U(t_0 - it_i, t_0) = \exp(-t_iH/\hbar) = \exp(-H/k_BT). \tag{5.59}$$

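
The following sketch checks Eq. (5.59) numerically for a two-level system: evolving over the imaginary interval $-it_i$ reproduces the Boltzmann operator, and the group property of Eq. (5.58) continues to hold for complex intermediate times. The Hamiltonian `H`, the natural units (`hbar = 1`), and the value of `kBT` are assumptions made only for illustration.

```python
import numpy as np

hbar = 1.0          # natural units (assumption)
kBT = 0.5           # thermal energy in the same units (assumption)
t_i = hbar / kBT    # imaginary-time extent, t_i = hbar / (k_B T)

H = np.array([[0.0, 0.3],
              [0.3, 1.0]])      # hypothetical two-level Hamiltonian

energies, vecs = np.linalg.eigh(H)

def evolution(dt):
    """U = exp(-(i/hbar) dt H) for the time-independent H; dt may be complex."""
    return (vecs * np.exp(-1j * dt * energies / hbar)) @ vecs.conj().T

U_imag = evolution(-1j * t_i)                                   # U(t0 - i t_i, t0)
boltzmann = (vecs * np.exp(-energies / kBT)) @ vecs.conj().T    # exp(-H / k_B T)

# Group property, Eq. (5.58), with a complex intermediate time
U_split = evolution(-0.4j * t_i) @ evolution(-0.6j * t_i)

print(np.allclose(U_imag, boltzmann), np.allclose(U_split, U_imag))
```

The spectral decomposition is used here instead of a general matrix exponential because $H$ is Hermitian and time independent, which is the case considered in Eq. (5.59).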
If we use a grand-canonical ensemble for a system in thermal equilibrium, we know from
Section 3.1.2 that its density operator is given by $\rho_{\rm GCE} = \exp[-(H - \mu N)/k_BT]$, where
$\mu$ is the chemical potential. Using it, the ensemble-averaged value of any operator $A(t)$ is
given by

$$\langle A(t)\rangle = \frac{\mathrm{Tr}[\rho_{\rm GCE}A(t)]}{\mathrm{Tr}[\rho_{\rm GCE}]}
= \frac{\mathrm{Tr}[\rho_{\rm GCE}\,U^\dagger(t_0 + t, t_0)\,A(t_0)\,U(t_0 + t, t_0)]}{\mathrm{Tr}[\rho_{\rm GCE}]}. \tag{5.60}$$

Assuming that the operators $H$ and $N$ commute, $\rho_{\rm GCE}$ can be written in terms of the
evolution operator $U$ as

$$\rho_{\rm GCE} = \exp(\mu N/k_BT)\,U(t_0 - it_i, t_0). \tag{5.61}$$

Using this form, the ensemble-averaged value can be written as

$$\langle A(t)\rangle = \frac{\mathrm{Tr}[\exp(\mu N/k_BT)\,U(t_0 - it_i, t_0)\,U(t_0, t_0 + t)\,A(t_0)\,U(t_0 + t, t_0)]}{\mathrm{Tr}[\exp(\mu N/k_BT)\,U(t_0 - it_i, t_0)]}. \tag{5.62}$$


The three $U$ operators in the numerator of this equation trace out the path in the complex
$t$ plane shown in Figure 5.8. This contour is called the Kadanoff-Baym-Keldysh
contour or just the Keldysh contour [308]. The same contour is known as the Schwinger-Keldysh
contour when the imaginary strip is neglected. The Keldysh contour has three
branches: a forward branch from $t_0$ to $t_0 + t$, a backward branch from $t_0 + t$ to $t_0$, and a
branch along the imaginary time axis from $t_0$ to $t_0 - it_i$. When using the contour, any point
lying on the imaginary time axis is taken to be a later time compared to all points lying on
the forward or the backward branch (along the real-time axis).

Owing to the group property of the evolution operator $U$, the intermediate time $t'$ in
Eq. (5.58) can take any value, including a complex one. Through the technique of analytic
continuation, we extend Eq. (5.62) and replace $t$ with a complex variable $z$. It is important to
recognize that the operator $A(t_0)$ commutes with $U$, and we can lump the three $U$ operators
together and write Eq. (5.62) in the following form:

$$\langle A(z_i)\rangle = \frac{\mathrm{Tr}\left\{\exp(\mu N/k_BT)\,\mathcal{T}_C\left[\exp\left(-\frac{i}{\hbar}\int_C H\,dz\right)A_{z_i}\right]\right\}}
{\mathrm{Tr}\left\{\exp(\mu N/k_BT)\,\mathcal{T}_C\left[\exp\left(-\frac{i}{\hbar}\int_C H\,dz\right)\right]\right\}}, \tag{5.63}$$

Figure 5.8 The Kadanoff-Baym-Keldysh contour in the complex time plane, with axes Re(t) and Im(t).
The arrows indicate time ordering.

where $z_i$ denotes a point on the contour $C$ shown in Figure 5.8. The contour time-ordering
operator $\mathcal{T}_C$ rearranges the operators in a chronological order, after considering the
commutative properties of operators involved. The subscript in $A_{z_i}$ denotes the position of the
operator $A$ on the contour at any point $z_i$ (i.e., it is an indexing variable on the contour $C$,
consisting of three branches: a forward branch $C_+$, a backward branch $C_-$, and a vertical
branch $C_v$). Note that the notion of earlier or later times is different on the contour from
that of the real time. For example, on the branch $C_-$, a point earlier in real time appears
later when we consider time ordering along this branch.

The behavior of the quantum system is quite different on the vertical branch $C_v$
compared to the other two branches because $z_i = -it_i$ is purely imaginary along it. Since $H$ is
a constant, the contour integral in Eq. (5.63) along this path leads to the simple relation

$$\int_{C_v} H\,dz = H[(t_0 - it_i) - t_0] = -iHt_i = -iH\hbar/(k_BT). \tag{5.64}$$

Substituting this result in Eq. (5.63), we obtain

$$\langle A(z_i)\rangle = \frac{\mathrm{Tr}\{\exp[-(H - \mu N)/k_BT]\,A\}}{\mathrm{Tr}\{\exp[-(H - \mu N)/k_BT]\}}
= \frac{\mathrm{Tr}[\rho_{\rm GCE}A]}{\mathrm{Tr}[\rho_{\rm GCE}]}, \tag{5.65}$$

where the cyclic property of the trace was used. We note that the resulting expression is
independent of $z_i$ and coincides with the thermal average expected before the time
evolution occurred. We can draw the following conclusions from these results. The average
$\langle A(z_i)\rangle$ corresponds to the statistical average of the observable $A$ when $z_i$ lies on the
forward and backward branches. In contrast, it corresponds to the thermal average of the
observable $A$ before the system is disturbed when $z_i$ lies on the imaginary time axis.

When using the Keldysh contour for calculating Green’s function, we frequently
encounter convolution-type integrals of the form

$$R(t, t') = \int_C P(t, \tau)\,Q(\tau, t')\,d\tau, \tag{5.66}$$

where $P$ and $Q$ are two time-dependent operators and $C$ is the Schwinger-Keldysh contour
in Figure 5.8. To evaluate such integrals, we follow the prescription provided by Langreth
[309] and let $t_0 \to -\infty$. When $t$ and $t'$ lie on the same branch, $R(t, t')$ can be calculated
using the conventional time ordering. However, when $t$ and $t'$ lie on different branches, we
label $R(t, t')$ as follows (in a way similar to what is done for Green’s functions):

• $R(t, t') = R^<(t, t')$ if $t \in C_+$ and $t' \in C_-$; referred to as “lesser” $R(t, t')$,
• $R(t, t') = R^>(t, t')$ if $t \in C_-$ and $t' \in C_+$; referred to as “greater” $R(t, t')$.

Using the “greater” and “lesser” functions, we can define two important functions as

• $R^+(t, t') = u(t - t')\,[R^>(t, t') - R^<(t, t')]$; referred to as “retarded” $R(t, t')$,
• $R^-(t, t') = u(t' - t)\,[R^<(t, t') - R^>(t, t')]$; referred to as “advanced” $R(t, t')$.

Notice the identity: $R^+(t, t') - R^-(t, t') = R^>(t, t') - R^<(t, t')$.




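Let us first calculate $R^<(t, t')$. Following Langreth [309], we replace the Schwinger-Keldysh
contour with a deformed contour consisting of an upper part $C_U$ and a lower part $C_L$,
as shown in Figure 5.9. Then $R^<(t, t')$ can be written as:

$$R^<(t, t') = \int_{C_U} P(t, \tau)\,Q^<(\tau, t')\,d\tau + \int_{C_L} P^<(t, \tau)\,Q(\tau, t')\,d\tau. \tag{5.67}$$

In the contour integral along the path $C_U$, we use $Q^<(\tau, t')$ because there $\tau \in C_U$ but
$t' \in C_L$. For the same reason, $P^<(t, \tau)$ is used for the contour integral along the path $C_L$.

The first contour integral along the path $C_U$ consists of two parts that can be combined
as follows:

$$\begin{aligned}
\int_{C_U} P(t, \tau)\,Q^<(\tau, t')\,d\tau
&= \int_{-\infty}^{t} P^>(t, \tau)\,Q^<(\tau, t')\,d\tau + \int_{t}^{-\infty} P^<(t, \tau)\,Q^<(\tau, t')\,d\tau \\
&= \int_{-\infty}^{\infty} u(t - \tau)\,[P^>(t, \tau) - P^<(t, \tau)]\,Q^<(\tau, t')\,d\tau \\
&= \int_{-\infty}^{\infty} P^+(t, \tau)\,Q^<(\tau, t')\,d\tau.
\end{aligned} \tag{5.68}$$

The same procedure is followed for the second contour integral along the path $C_L$:

$$\begin{aligned}
\int_{C_L} P^<(t, \tau)\,Q(\tau, t')\,d\tau
&= \int_{-\infty}^{t'} P^<(t, \tau)\,Q^<(\tau, t')\,d\tau + \int_{t'}^{-\infty} P^<(t, \tau)\,Q^>(\tau, t')\,d\tau \\
&= \int_{-\infty}^{\infty} u(t' - \tau)\,P^<(t, \tau)\,[Q^<(\tau, t') - Q^>(\tau, t')]\,d\tau \\
&= \int_{-\infty}^{\infty} P^<(t, \tau)\,Q^-(\tau, t')\,d\tau.
\end{aligned} \tag{5.69}$$

Combining Eqs. (5.68) and (5.69), we obtain Langreth’s result

$$R^<(t, t') = \int_{-\infty}^{+\infty} \left[P^+(t, \tau)\,Q^<(\tau, t') + P^<(t, \tau)\,Q^-(\tau, t')\right] d\tau. \tag{5.70}$$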


Figure 5.9 Langreth’s modification of the Schwinger-Keldysh time-contour $C$ in the complex time plane
(panels: Schwinger-Keldysh contour and Langreth contour). As a result of this modification, we have
$C = C_U \cup C_L$.

Using the same procedure, we can write $R^>(t, t')$ as

$$R^>(t, t') = \int_{-\infty}^{+\infty} \left[P^+(t, \tau)\,Q^>(\tau, t') + P^>(t, \tau)\,Q^-(\tau, t')\right] d\tau. \tag{5.71}$$

These functions can be used to obtain the retarded function $R^+(t, t')$ as

$$\begin{aligned}
R^+(t, t') &= u(t - t')\,[R^>(t, t') - R^<(t, t')] \\
&= u(t - t') \int_{-\infty}^{+\infty} \left[P^+(t, \tau)\,Q^>(\tau, t') + P^>(t, \tau)\,Q^-(\tau, t')\right] d\tau \\
&\quad - u(t - t') \int_{-\infty}^{+\infty} \left[P^+(t, \tau)\,Q^<(\tau, t') + P^<(t, \tau)\,Q^-(\tau, t')\right] d\tau.
\end{aligned} \tag{5.72}$$

Using the definitions of $P^+(t, \tau)$ and $Q^-(\tau, t')$, we can combine the two terms in the
preceding equation to obtain

$$R^+(t, t') = \int_{t'}^{t} P^+(t, \tau)\,Q^+(\tau, t')\,d\tau. \tag{5.73}$$

Similarly, we can show that the advanced function $R^-(t, t')$ can be written as

$$R^-(t, t') = \int_{t}^{t'} P^-(t, \tau)\,Q^-(\tau, t')\,d\tau. \tag{5.74}$$
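
The Langreth rules can be checked numerically. The sketch below evaluates Eqs. (5.70) and (5.71) on a time grid, forms the retarded function from its definition in Eq. (5.72), and compares it with the direct integral in Eq. (5.73). The Gaussian forms chosen for the greater and lesser components of $P$ and $Q$ are purely hypothetical test functions, and the step function is taken with $u(0) = 0$.

```python
import numpy as np

def step(x):
    """Heaviside step function u(x), with u(0) = 0."""
    return np.where(x > 0, 1.0, 0.0)

# Hypothetical smooth "greater" and "lesser" components (illustrative only).
def P_gt(a, b): return 1.0 * np.exp(-(a - b) ** 2)
def P_lt(a, b): return 0.3 * np.exp(-(a - b) ** 2)
def Q_gt(a, b): return 0.8 * np.exp(-0.5 * (a - b) ** 2)
def Q_lt(a, b): return 0.2 * np.exp(-0.5 * (a - b) ** 2)

def retarded(f_gt, f_lt):
    return lambda a, b: step(a - b) * (f_gt(a, b) - f_lt(a, b))

def advanced(f_gt, f_lt):
    return lambda a, b: step(b - a) * (f_lt(a, b) - f_gt(a, b))

P_ret = retarded(P_gt, P_lt)
Q_ret, Q_adv = retarded(Q_gt, Q_lt), advanced(Q_gt, Q_lt)

tau = np.linspace(-15.0, 15.0, 60001)     # integration grid for the variable tau
d_tau = tau[1] - tau[0]

def integrate(y):
    """Trapezoidal rule on the uniform tau grid."""
    return d_tau * (y.sum() - 0.5 * (y[0] + y[-1]))

t, tp = 1.0, -0.5     # t > t', so the retarded function is nonzero

R_lt = integrate(P_ret(t, tau) * Q_lt(tau, tp) + P_lt(t, tau) * Q_adv(tau, tp))  # Eq. (5.70)
R_gt = integrate(P_ret(t, tau) * Q_gt(tau, tp) + P_gt(t, tau) * Q_adv(tau, tp))  # Eq. (5.71)
R_ret_from_def = step(t - tp) * (R_gt - R_lt)                                     # Eq. (5.72)
R_ret_direct = integrate(P_ret(t, tau) * Q_ret(tau, tp))                          # Eq. (5.73)

print(R_ret_from_def, R_ret_direct)
```

The two values should agree up to the discretization error of the grid, because the step functions restrict the integrand to the interval between $t'$ and $t$.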


5.3.2 Current Flow through a Quantum Device 


To apply the NGF method to the quantum system in Figure 5.7, we need to first find its
Hamiltonian $H_{\rm NGF}$. This Hamiltonian should include all parts of the system including
the two electrodes (called contact leads), the quantum device, and the coupling regions
between the contact leads and the device. Thus, the Hamiltonian has the general form

$$H_{\rm NGF} = H_{\rm CL1} + H_{\chi 1} + H_{\rm QD} + H_{\chi 2} + H_{\rm CL2}, \tag{5.75}$$

where the subscripts CL and $\chi$ are used for the contact lead and the coupling, with 1 and
2 added for the two sides of the quantum device. The device Hamiltonian $H_{\rm QD}$ depends on
the details of charge transport and may contain terms related to interaction of an electron
with excitons, phonons, and other electrons. To allow for these terms, we assume that $H_{\rm QD}$
is a function of multiple creation and annihilation operators,

$$H_{\rm QD} = H_{\rm QD}\left(\{\hat{q}_n^\dagger, \hat{q}_n\}\right), \tag{5.76}$$

where the curly brackets denote the entire set of these operators.
The Hamiltonians of the contact leads depend on the applied voltages such that

$$H_{{\rm CL}\eta} = \sum_{\varsigma} \left[E_{\eta\varsigma} - q_eV_\eta(t)\right] \hat{l}^\dagger_{\eta\varsigma}\hat{l}_{\eta\varsigma} \quad (\eta = 1, 2), \tag{5.77}$$

where the operators $\hat{l}^\dagger_{\eta\varsigma}$ and $\hat{l}_{\eta\varsigma}$ create or annihilate single electrons of energy $E_{\eta\varsigma}$ and $V_\eta(t)$
is the potential on the left or right electrode for $\eta = 1, 2$. A single index $\varsigma$ is used to account
for the electron’s momentum $\hbar k$ (discrete or continuous $k$), its spin, and other conserved
quantities needed to represent the quantum device. This index can be conceptualized as a
generalized channel vector describing the current flow, because each value corresponds to
one transport channel through the device.

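The coupling Hamiltonian between the leads and the quantum device contains products
of the creation and annihilation operators associated with them. We use the normal ordering
(or Wick ordering) and place all creation operators to the left of the annihilation operators.
With this choice, this part of the Hamiltonian has the form

$$H_{\chi\eta} = \sum_{\varsigma,n} \left(\chi_{\eta\varsigma,n}\,\hat{l}^\dagger_{\eta\varsigma}\hat{q}_n + \chi^*_{\eta\varsigma,n}\,\hat{q}^\dagger_n\hat{l}_{\eta\varsigma}\right), \tag{5.78}$$

where the coupling coefficients $\chi_{\eta\varsigma,n}$ are found self-consistently considering changes in
the distribution of electrons resulting from the current flow.

Using the total Hamiltonian given in Eq. (5.75), we can calculate the current flowing
from each contact electrode toward the quantum device using $I_\eta = dQ_\eta/dt$, where
$Q_\eta(t) = -q_e\langle\hat{N}_\eta(t)\rangle$ is the total charge at the electrode and $\hat{N}_\eta$ is the number operator for
electrons of charge $-q_e$. The number operator at each electrode is given by

$$\hat{N}_\eta = \sum_{\varsigma} \hat{l}^\dagger_{\eta\varsigma}\hat{l}_{\eta\varsigma}, \qquad \eta = 1, 2. \tag{5.79}$$

In the Heisenberg picture, the evolution of $\hat{N}_\eta$ is governed by

$$\frac{d\hat{N}_\eta}{dt} = \frac{i}{\hbar}\left[H_{\rm NGF}, \hat{N}_\eta\right] = \frac{i}{\hbar}\left[H_{\chi\eta}, \hat{N}_\eta\right], \tag{5.80}$$

where we used the fact that all parts of $H_{\rm NGF}$ commute with $\hat{N}_\eta$ except the coupling part
$H_{\chi\eta}$. We can now calculate the current as

$$I_\eta = -q_e\left\langle\frac{d\hat{N}_\eta}{dt}\right\rangle = -\frac{iq_e}{\hbar}\left\langle\left[H_{\chi\eta}, \hat{N}_\eta\right]\right\rangle. \tag{5.81}$$

This relation shows that the current depends only on a single commutator involving the
coupling Hamiltonian.

We can calculate this commutator by using $H_{\chi\eta}$ from Eq. (5.78) and writing the current
in the form

$$I_\eta = -\frac{iq_e}{\hbar}\left\langle\left[\sum_{\varsigma,n}\left(\chi_{\eta\varsigma,n}\,\hat{l}^\dagger_{\eta\varsigma}\hat{q}_n + \chi^*_{\eta\varsigma,n}\,\hat{q}^\dagger_n\hat{l}_{\eta\varsigma}\right),\ \sum_{\varsigma'}\hat{l}^\dagger_{\eta\varsigma'}\hat{l}_{\eta\varsigma'}\right]\right\rangle. \tag{5.82}$$

Clearly, we need to evaluate multiple commutators of the form
$[\hat{l}^\dagger_{\eta_1\varsigma_1}\hat{q}_n,\ \hat{l}^\dagger_{\eta_2\varsigma_2}\hat{l}_{\eta_2\varsigma_2}]$. We can simplify this commutator as

$$\begin{aligned}
\left[\hat{l}^\dagger_{\eta_1\varsigma_1}\hat{q}_n,\ \hat{l}^\dagger_{\eta_2\varsigma_2}\hat{l}_{\eta_2\varsigma_2}\right]
&= \left(\hat{l}^\dagger_{\eta_1\varsigma_1}\hat{l}^\dagger_{\eta_2\varsigma_2}\hat{l}_{\eta_2\varsigma_2} - \hat{l}^\dagger_{\eta_2\varsigma_2}\hat{l}_{\eta_2\varsigma_2}\hat{l}^\dagger_{\eta_1\varsigma_1}\right)\hat{q}_n \\
&= -\delta_{\eta_1\varsigma_1,\eta_2\varsigma_2}\,\hat{l}^\dagger_{\eta_2\varsigma_2}\hat{q}_n.
\end{aligned} \tag{5.83}$$

Similarly, $[\hat{q}^\dagger_n\hat{l}_{\eta_1\varsigma_1},\ \hat{l}^\dagger_{\eta_2\varsigma_2}\hat{l}_{\eta_2\varsigma_2}] = \delta_{\eta_1\varsigma_1,\eta_2\varsigma_2}\,\hat{q}^\dagger_n\hat{l}_{\eta_2\varsigma_2}$. With these results, the current in Eq.
(5.81) takes the form

$$I_\eta = \frac{iq_e}{\hbar}\sum_{\varsigma,n}\left(\chi_{\eta\varsigma,n}\left\langle\hat{l}^\dagger_{\eta\varsigma}(t)\hat{q}_n(t)\right\rangle - \chi^*_{\eta\varsigma,n}\left\langle\hat{q}^\dagger_n(t)\hat{l}_{\eta\varsigma}(t)\right\rangle\right). \tag{5.84}$$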
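
The commutators behind Eq. (5.83) can be verified with explicit matrices. The sketch below builds three fermionic modes (two lead states and one device state) through a Jordan-Wigner construction, which is an implementation choice made here and not something taken from the text, and checks both commutator identities for equal and for different lead states. The helper name `fermion_ops` is hypothetical.

```python
import numpy as np
from functools import reduce

def fermion_ops(n_modes):
    """Annihilation operators for n_modes fermionic modes (Jordan-Wigner matrices)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])   # single-mode annihilation operator
    z = np.diag([1.0, -1.0])                 # parity string enforcing anticommutation
    eye = np.eye(2)
    ops = []
    for j in range(n_modes):
        factors = [z] * j + [a] + [eye] * (n_modes - j - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def comm(A, B):
    return A @ B - B @ A

# Two lead states (l_a, l_b) and one device state (q); the matrices are real,
# so the transpose plays the role of the Hermitian conjugate.
l_a, l_b, q = fermion_ops(3)

# Same lead state: [l^dag q, l^dag l] = -l^dag q
same_state = comm(l_a.T @ q, l_a.T @ l_a)
# Different lead states: the commutator vanishes (Kronecker delta equals zero)
diff_state = comm(l_a.T @ q, l_b.T @ l_b)
# Companion identity: [q^dag l, l^dag l] = +q^dag l
companion = comm(q.T @ l_a, l_a.T @ l_a)

print(np.allclose(same_state, -(l_a.T @ q)),
      np.allclose(diff_state, 0.0),
      np.allclose(companion, q.T @ l_a))
```

Because the identities follow from the anticommutation relations alone, the same check applies to any number of lead and device states.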
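The preceding expression contains ensemble-averaged operator products that can be
expressed as combinations of Green’s functions. Several types of Green’s functions are
discussed in Aside 5.3. Here we use two lesser types of Green’s functions defined as

$$G^<_{n,\eta\varsigma}(t - t') = \frac{i}{\hbar}\left\langle\hat{l}^\dagger_{\eta\varsigma}(t')\,\hat{q}_n(t)\right\rangle, \tag{5.85}$$

$$G^<_{\eta\varsigma,n}(t - t') = \frac{i}{\hbar}\left\langle\hat{q}^\dagger_n(t')\,\hat{l}_{\eta\varsigma}(t)\right\rangle, \tag{5.86}$$

where a comma between the subscripts of a Green’s function is used when the operators
from two different parts of the system are involved. Such Green’s functions are called
mixed types.

It turns out that the energy-domain representation of Green’s function is more useful for
calculating the current. Taking the Fourier transform, we use the relation

$$G^<_{n,\eta\varsigma}(E) = \int_{-\infty}^{\infty} G^<_{n,\eta\varsigma}(\tau)\exp\left(\frac{i}{\hbar}E\tau\right)d\tau. \tag{5.87}$$

As $G^<_{\eta\varsigma,n}(t - t') = -[G^<_{n,\eta\varsigma}(t' - t)]^*$, we can use the relation $G^<_{\eta\varsigma,n}(E) = -[G^<_{n,\eta\varsigma}(E)]^*$
and write the current in the form

$$I_\eta = \frac{q_e}{2\pi\hbar}\int \sum_{\varsigma,n}\left\{\chi_{\eta\varsigma,n}\,G^<_{n,\eta\varsigma}(E) + \chi^*_{\eta\varsigma,n}\left[G^<_{n,\eta\varsigma}(E)\right]^*\right\} dE. \tag{5.88}$$

Using $\hbar = h/2\pi$, this equation can be written in a more compact form as

$$I_\eta = \frac{2q_e}{h}\int \sum_{\varsigma,n}\mathrm{Re}\left[\chi_{\eta\varsigma,n}\,G^<_{n,\eta\varsigma}(E)\right] dE, \tag{5.89}$$

where Re denotes the real part.
The Green’s function $G^<_{n,\eta\varsigma}(E)$, the only one needed to calculate the current, can
be obtained by applying the Langreth theorem to the time-ordered Green’s function
encountered earlier and defined as

$$G_{n,\eta\varsigma}(t - t') = -\frac{i}{\hbar}\left\langle\mathcal{T}\left\{\hat{q}_n(t)\,\hat{l}^\dagger_{\eta\varsigma}(t')\right\}\right\rangle. \tag{5.90}$$

As the leads are assumed not to interact directly with each other (e.g., via tunneling),
the calculation of this Green’s function does not generate the Bogoliubov hierarchy. Its
equation of motion has the form (see Aside 5.4)

$$-i\hbar\frac{\partial}{\partial t'}G_{n,\eta\varsigma}(t - t') = \delta_{n,\eta\varsigma}\,\delta(t - t') + E_{\eta\varsigma}\,G_{n,\eta\varsigma}(t - t') + \sum_m \chi^*_{\eta\varsigma,m}\,G_{nm}(t - t'), \tag{5.91}$$

showing that $G_{n,\eta\varsigma}(t - t')$ can be written in terms of the Green’s function $G_{nm}(t - t')$
associated with the quantum device. Using this equation, one can show that (see Ref. [324]
for details)

$$G^<_{n,\eta\varsigma}(E) = \sum_m \chi^*_{\eta\varsigma,m}\left[G^+_{nm}(E)\,G^<_{\eta\varsigma}(E) + G^<_{nm}(E)\,G^-_{\eta\varsigma}(E)\right]. \tag{5.92}$$
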
Aside 5.3 Several Kinds of Green’s Functions


To make the following discussion applicable to both bosons and fermions, we adopt
the notation $[\alpha, \beta]_x = \alpha\beta + x\beta\alpha$, where $x = +1$ for fermions and $-1$ for bosons
(see also Section 2.4.1). The statistical average is taken using the initial density
operator at time $t = t_0$ (i.e., $\langle\hat{O}\rangle = \mathrm{Tr}\{\rho(t_0)\hat{O}\}$). Let $\{\hat{q}^\dagger_\varsigma, \hat{q}_\varsigma\}$ represent a set of creation and
annihilation operators, where the index $\varsigma$ spans the available single-particle quantum
states. Using these operators, several types of Green’s functions can be defined as
follows [322]:


Lesser and greater Green’s functions
These two basic Green’s functions are defined using the product of creation and
annihilation operators:

$$G^<_{\varsigma\varphi}(t, t') = \frac{i}{\hbar}\left\langle\hat{q}^\dagger_\varphi(t')\,\hat{q}_\varsigma(t)\right\rangle, \tag{5.93}$$

$$G^>_{\varsigma\varphi}(t, t') = -\frac{i}{\hbar}\left\langle\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\right\rangle. \tag{5.94}$$

Retarded and advanced Green’s functions

The preceding two Green’s functions can be combined to construct the retarded and
advanced Green’s functions as

$$G^+_{\varsigma\varphi}(t, t') = u(t - t')\left[G^>_{\varsigma\varphi}(t, t') - G^<_{\varsigma\varphi}(t, t')\right], \tag{5.95}$$

$$G^-_{\varsigma\varphi}(t, t') = u(t' - t)\left[G^<_{\varsigma\varphi}(t, t') - G^>_{\varsigma\varphi}(t, t')\right]. \tag{5.96}$$

The step function $u(t - t')$ ensures causality. Using the definitions of the lesser and greater
Green’s functions, the preceding two functions can be written as

$$G^+_{\varsigma\varphi}(t, t') = -\frac{i}{\hbar}u(t - t')\left\langle\left[\hat{q}_\varsigma(t), \hat{q}^\dagger_\varphi(t')\right]_x\right\rangle, \tag{5.97}$$

$$G^-_{\varsigma\varphi}(t, t') = \frac{i}{\hbar}u(t' - t)\left\langle\left[\hat{q}_\varsigma(t), \hat{q}^\dagger_\varphi(t')\right]_x\right\rangle. \tag{5.98}$$

The retarded Green’s function is useful because it can be used to find the linear response
of any quantum device. For example, using it, we can characterize deviations from
equilibrium (dissipation in the case of the electrical conductivity) in terms of fluctuations, leading
to the fluctuation-dissipation theorem discussed in Section 3.3.


Time-ordered Green’s function

The time-ordering operator $\mathcal{T}$ rearranges a product of two time-dependent operators such
that operators evaluated at a later time always appear to the left of the operators evaluated
at earlier times. Such a Green’s function is defined as

$$\tilde{G}_{\varsigma\varphi}(t, t') = -\frac{i}{\hbar}\left\langle\mathcal{T}\left\{\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\right\}\right\rangle. \tag{5.99}$$

It can be related to the lesser and greater Green’s functions as

$$\begin{aligned}
\tilde{G}_{\varsigma\varphi}(t, t') &= -\frac{i}{\hbar}u(t - t')\left\langle\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\right\rangle + x\,\frac{i}{\hbar}u(t' - t)\left\langle\hat{q}^\dagger_\varphi(t')\,\hat{q}_\varsigma(t)\right\rangle \\
&= u(t - t')\,G^>_{\varsigma\varphi}(t, t') + x\,u(t' - t)\,G^<_{\varsigma\varphi}(t, t').
\end{aligned} \tag{5.100}$$

It is possible to attach a physical meaning to this Green’s function. For example, consider
the case of electrons. For $t > t'$, $\tilde{G}_{\varsigma\varphi}(t, t')$ describes the motion of an electron created at time $t'$
in the state $\varphi$ and detected at time $t$ in the state $\varsigma$. In contrast, for $t' > t$, $\tilde{G}_{\varsigma\varphi}(t, t')$ describes
the motion of a hole created in the state $\varsigma$ at time $t$ and detected in the state $\varphi$ at time $t'$.


5.3.3 Lehmann Representation of Green’s Functions 


Lehmann representation provides a rigorous way to describe Green’s functions using 
eigen-energies of the device Hamiltonian. It also paves the way for the energy-basis 
representation of a Green’s function, which is often suited for calculations in practice. 

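As we have seen, Green’s functions require averaging over an equilibrium ensemble.
This must be done by invoking a grand-canonical ensemble because the creation and
annihilation operators change the number of particles. As we discussed in Section 3.1.3,
ensemble averaging of any operator $\hat{O}$ in this situation is carried out using

$$\langle\hat{O}\rangle = \mathrm{Tr}[\rho_{\rm GCE}\,\hat{O}]/Z_\mu, \tag{5.101}$$

where $\rho_{\rm GCE} = \exp(-H_{\rm eff}/k_BT)$, $Z_\mu = \mathrm{Tr}[\rho_{\rm GCE}]$, $H_{\rm eff} = H - \mu N$, $H$ is the device
Hamiltonian, $\mu$ is the chemical potential, and $N$ is the number of particles.

We denote by $\{|n\rangle\}$ the complete set of orthonormal eigenstates of the Hamiltonian $H_{\rm eff}$
obtained by solving $H_{\rm eff}|n\rangle = E_n|n\rangle$, where $E_n$ is the energy associated with the state $|n\rangle$.
It is easy to show that

$$\begin{aligned}
\langle m|\hat{q}_\varsigma(t)|n\rangle &= \langle m|\exp(iH_{\rm eff}t/\hbar)\,\hat{q}_\varsigma\exp(-iH_{\rm eff}t/\hbar)|n\rangle \\
&= \exp[i(E_m - E_n)t/\hbar]\,\langle m|\hat{q}_\varsigma|n\rangle,
\end{aligned} \tag{5.102}$$

where we used $\exp(-iH_{\rm eff}t/\hbar)|n\rangle = \exp(-iE_nt/\hbar)|n\rangle$. Using this result, we can calculate
the average in Eq. (5.101) in the basis $\{|n\rangle\}$ to obtain

$$\langle\hat{O}\rangle = \sum_n \langle n|\exp(-H_{\rm eff}/k_BT)\,\hat{O}|n\rangle/Z_\mu = \sum_n \exp(-E_n/k_BT)\,\langle n|\hat{O}|n\rangle/Z_\mu. \tag{5.103}$$

The preceding result can be used to calculate the lesser Green’s function as

$$\begin{aligned}
G^<_{\varsigma\varphi}(t, t') &= \frac{i}{\hbar Z_\mu}\sum_n \langle n|\exp(-H_{\rm eff}/k_BT)\,\hat{q}^\dagger_\varphi(t')\,\hat{q}_\varsigma(t)|n\rangle \\
&= \frac{i}{\hbar Z_\mu}\sum_{n,m} \langle n|\exp(-H_{\rm eff}/k_BT)\,\hat{q}^\dagger_\varphi(t')|m\rangle\langle m|\hat{q}_\varsigma(t)|n\rangle \\
&= \frac{i}{\hbar Z_\mu}\sum_{n,m} \exp\left[-\frac{E_n}{k_BT} + \frac{i}{\hbar}(E_n - E_m)(t' - t)\right]\langle n|\hat{q}^\dagger_\varphi|m\rangle\langle m|\hat{q}_\varsigma|n\rangle.
\end{aligned} \tag{5.104}$$

Similarly, we obtain the greater Green’s function in the form

$$G^>_{\varsigma\varphi}(t, t') = -\frac{i}{\hbar Z_\mu}\sum_{n,m} \exp\left[-\frac{E_n}{k_BT} + \frac{i}{\hbar}(E_n - E_m)(t - t')\right]\langle n|\hat{q}_\varsigma|m\rangle\langle m|\hat{q}^\dagger_\varphi|n\rangle. \tag{5.105}$$

The retarded and advanced Green’s functions can now be calculated as they are related to
$G^<_{\varsigma\varphi}(t, t')$ and $G^>_{\varsigma\varphi}(t, t')$ as simple linear combinations. We note that the eigenstates $\{|n\rangle\}$
can always be written as a linear combination of the Fock-space states, if the
occupation-number representation is employed.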

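Even though all Green’s functions are indicated as functions of two time variables
($t$ and $t'$), the preceding results show that, in fact, they depend on time only through the
difference $t - t'$. This result is very important because it allows us to introduce one time
parameter $\tau = t - t'$ and to write any Green’s function in the frequency space as a Fourier
transform with respect to $\tau$. As we have already discussed, the Fourier domain provides a
compact and intuitive approach to such complex problems.

We can also show that Green’s functions depend only on the time difference $t - t'$ by
analyzing the product $\langle\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\rangle$. Noting that the time-dependent operator has the form
$\hat{q}_\varsigma(t) = \exp(iH_{\rm eff}t/\hbar)\,\hat{q}_\varsigma\exp(-iH_{\rm eff}t/\hbar)$, we obtain

$$\begin{aligned}
\left\langle\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\right\rangle &= \frac{1}{Z_\mu}\mathrm{Tr}\left[\rho_{\rm GCE}\,\hat{q}_\varsigma(t)\,\hat{q}^\dagger_\varphi(t')\right] \\
&= \frac{1}{Z_\mu}\mathrm{Tr}\left[\rho_{\rm GCE}\exp\left(\frac{i}{\hbar}H_{\rm eff}(t - t')\right)\hat{q}_\varsigma\exp\left(-\frac{i}{\hbar}H_{\rm eff}(t - t')\right)\hat{q}^\dagger_\varphi\right],
\end{aligned} \tag{5.106}$$

which is clearly a function of $t - t'$. This conclusion is a direct consequence of the
time-evolution operator, $U(t) = \exp(-iH_{\rm eff}t/\hbar)$, which commutes with $\rho_{\rm GCE}$ because both are
functions of the same Hamiltonian $H_{\rm eff}$.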
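
The time-translation invariance expressed by Eq. (5.106) can be illustrated numerically. The sketch below evaluates the thermal average of a product of Heisenberg operators for a hypothetical two-level fermionic device and confirms that it depends only on $t - t'$. The parameter values, the natural units, and the Jordan-Wigner helper `fermion_ops` are assumptions introduced for this example.

```python
import numpy as np
from functools import reduce

hbar, kBT, mu = 1.0, 0.7, 0.2      # natural units; illustrative values (assumptions)

def fermion_ops(n_modes):
    """Annihilation operators for n_modes fermionic modes (Jordan-Wigner matrices)."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])
    z = np.diag([1.0, -1.0])
    eye = np.eye(2)
    return [reduce(np.kron, [z] * j + [a] + [eye] * (n_modes - j - 1))
            for j in range(n_modes)]

q0, q1 = fermion_ops(2)
N = q0.T @ q0 + q1.T @ q1
# Hypothetical device: two levels coupled by a hopping term
H = 0.5 * q0.T @ q0 + 1.1 * q1.T @ q1 + 0.3 * (q0.T @ q1 + q1.T @ q0)
H_eff = H - mu * N

E, V = np.linalg.eigh(H_eff)                 # eigen-energies and eigenstates of H_eff
rho = (V * np.exp(-E / kBT)) @ V.T           # unnormalized exp(-H_eff / k_B T)
Z = np.trace(rho)

def heisenberg(op, t):
    """op(t) = exp(i H_eff t / hbar) op exp(-i H_eff t / hbar)."""
    U = (V * np.exp(-1j * E * t / hbar)) @ V.T
    return U.conj().T @ op @ U

def correlator(t, tp):
    """Thermal average <q0(t) q1^dagger(t')> in the grand-canonical ensemble."""
    return np.trace(rho @ heisenberg(q0, t) @ heisenberg(q1.T, tp)) / Z

c1 = correlator(1.3, 0.4)
c2 = correlator(1.3 + 5.0, 0.4 + 5.0)        # both times shifted by the same amount
c3 = correlator(0.9, 0.0)                    # same difference t - t' = 0.9

print(c1, c2, c3)
```

All three values coincide to numerical precision because the shift operator commutes with the density operator, which is the argument given in the text.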

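Using $\tau = t - t'$, the Fourier transform $G(E)$ of $G(\tau)$ is defined as

$$G(E) = \mathcal{F}\{G(\tau)\} = \int_{-\infty}^{\infty} G(\tau)\exp\left(\frac{i}{\hbar}E\tau\right)d\tau. \tag{5.107}$$

When a Green’s function contains the step function $u(\tau)$, we make use of the
Sokhotski-Plemelj formula (see Aside 3.3):

$$\int_{-\infty}^{\infty} u(\tau)\exp\left(\frac{i}{\hbar}E\tau\right)d\tau = \frac{i}{E/\hbar + i\eta}, \tag{5.108}$$

$$\lim_{\eta\to 0}\frac{1}{E/\hbar + i\eta} = \mathrm{P.V.}\left(\frac{\hbar}{E}\right) - i\pi\hbar\,\delta(E), \tag{5.109}$$

where $\eta$ is an infinitely small positive quantity.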
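
A quick numerical check of Eq. (5.108) is possible if $\eta$ is kept finite, so that the factor $\exp(-\eta\tau)$ makes the integral converge on a finite grid. The values of `eta` and `E`, the grid, and the natural units are assumptions chosen only for illustration.

```python
import numpy as np

hbar = 1.0      # natural units (assumption)
eta = 0.5       # finite damping rate so that the integral converges numerically
E = 0.8         # sample energy (assumption)

# The step function u(tau) restricts the integral to tau >= 0.
tau = np.linspace(0.0, 60.0, 600001)
d_tau = tau[1] - tau[0]
integrand = np.exp(1j * E * tau / hbar - eta * tau)

# Trapezoidal rule on the uniform grid
numeric = d_tau * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
closed_form = 1j / (E / hbar + 1j * eta)

print(numeric, closed_form)
```

As `eta` is reduced, the imaginary part of $1/(E/\hbar + i\eta)$ narrows into the delta function of Eq. (5.109), while the real part approaches the principal value.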
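As an example, consider the time-ordered Green’s function $\tilde{G}_{\varsigma\varphi}(t, t')$. Its Fourier
transform (or the energy space representation) is given by:

$$\tilde{G}_{\varsigma\varphi}(E) = \mathcal{F}\left\{u(t - t')\,G^>_{\varsigma\varphi}(t, t')\right\} + x\,\mathcal{F}\left\{u(t' - t)\,G^<_{\varsigma\varphi}(t, t')\right\}. \tag{5.110}$$

Using Eqs. (5.104), (5.105), and (5.108), the preceding two Fourier transforms become

$$\mathcal{F}\left\{u(t - t')\,G^>_{\varsigma\varphi}(t, t')\right\} = \frac{1}{\hbar Z_\mu}\sum_{n,m}\frac{\exp(-E_n/k_BT)\,\langle n|\hat{q}_\varsigma|m\rangle\langle m|\hat{q}^\dagger_\varphi|n\rangle}{E/\hbar + (E_n - E_m)/\hbar + i\eta}, \tag{5.111}$$

$$\mathcal{F}\left\{u(t' - t)\,G^<_{\varsigma\varphi}(t, t')\right\} = \frac{1}{\hbar Z_\mu}\sum_{n,m}\frac{\exp(-E_n/k_BT)\,\langle n|\hat{q}^\dagger_\varphi|m\rangle\langle m|\hat{q}_\varsigma|n\rangle}{E/\hbar - (E_n - E_m)/\hbar - i\eta}. \tag{5.112}$$

The resulting expression for $\tilde{G}(E)$ is called the Lehmann representation of the
time-ordered Green’s function. It shows explicitly the dependence of the Green’s function on
the eigen-energies of the effective Hamiltonian of the system. We can simplify this result
further by swapping the dummy indices $m$ and $n$ in the preceding equation to obtain

$$\mathcal{F}\left\{u(t' - t)\,G^<_{\varsigma\varphi}(t, t')\right\} = \frac{1}{\hbar Z_\mu}\sum_{n,m}\frac{\exp(-E_m/k_BT)\,\langle m|\hat{q}^\dagger_\varphi|n\rangle\langle n|\hat{q}_\varsigma|m\rangle}{E/\hbar + (E_n - E_m)/\hbar - i\eta}. \tag{5.113}$$

It is useful to define a new function, called the spectral function, as

$$\Gamma_{\varsigma\varphi}(E) = \frac{2\pi}{Z_\mu}\sum_{n,m}\exp(-E_m/k_BT)\,\langle m|\hat{q}^\dagger_\varphi|n\rangle\langle n|\hat{q}_\varsigma|m\rangle\,\delta(E + E_n - E_m). \tag{5.114}$$

Using the spectral function, we can write the lesser and greater Green’s functions as

$$G^<_{\varsigma\varphi}(\tau) = \frac{i}{2\pi\hbar}\int_{-\infty}^{\infty}\Gamma_{\varsigma\varphi}(E)\exp\left(-\frac{i}{\hbar}E\tau\right)dE, \tag{5.115}$$

$$G^>_{\varsigma\varphi}(\tau) = -\frac{i}{2\pi\hbar}\int_{-\infty}^{\infty}\Gamma_{\varsigma\varphi}(E)\exp\left(\frac{E}{k_BT} - \frac{i}{\hbar}E\tau\right)dE. \tag{5.116}$$

The energy-space representation of other Green’s functions can also be related to the
spectral function. It is easy to deduce from Eq. (5.114) that $\Gamma_{\varsigma\varphi}(E)$ is real and positive when
$\varsigma = \varphi$.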


Aside 5.4 Equation of Motion for Green’s Functions 


In this Aside, we derive an equation of motion for the retarded Green’s function, $G^+_{\varsigma\varphi}$, as
defined in Eq. (5.95). It can be used to find such equations for other types of Green’s
functions. We start with Eq. (2.62) governing temporal evolution of an operator in the
Heisenberg picture. In the case of the retarded Green’s function, the relevant operator is
$\hat{q}_\varsigma(t)$ and its evolution is governed by

$$i\hbar\frac{d\hat{q}_\varsigma}{dt} = \left[\hat{q}_\varsigma(t), H_{\rm eff}\right]. \tag{5.117}$$

We use this result to calculate the time derivative of the retarded Green’s function given
in Eq. (5.95). Recalling that any Green’s function depends on its two arguments such that
$G(t, t') = G(t - t')$ and setting $t' = 0$, the equation of motion is found to contain the
following two terms:

$$\begin{aligned}
\frac{dG^+_{\varsigma\varphi}(t)}{dt} &= -\frac{i}{\hbar}\delta(t)\left\langle\left[\hat{q}_\varsigma(t), \hat{q}^\dagger_\varphi(0)\right]_x\right\rangle - \frac{i}{\hbar}u(t)\left\langle\left[d\hat{q}_\varsigma/dt, \hat{q}^\dagger_\varphi(0)\right]_x\right\rangle \\
&= -\frac{i}{\hbar}\delta(t)\left\langle\left[\hat{q}_\varsigma(0), \hat{q}^\dagger_\varphi(0)\right]_x\right\rangle - \frac{1}{\hbar^2}u(t)\left\langle\left[\left[\hat{q}_\varsigma(t), H_{\rm eff}\right], \hat{q}^\dagger_\varphi(0)\right]_x\right\rangle.
\end{aligned} \tag{5.118}$$

One may think of the second term as a higher-order Green’s function and derive a new
equation of motion for this function, but that equation would contain a new Green’s
function of still higher order, resulting in an infinite hierarchical chain of equations of
motion.

This complication can be avoided in the case of noninteracting systems, because the second
term in Eq. (5.118) can be expressed in terms of $G^+_{\varsigma\varphi}$. However, for an interacting
many-body quantum system, a closed-form solution exists only if a conserved quantity can be
found such that the commutator associated with one of the high-order Green’s functions
vanishes. If that is not the case, one needs to truncate the infinite hierarchy by expressing
one high-order Green’s function in terms of lower-order Green’s functions, resulting in an
approximate solution. Such an approach is not systematic, and the solution depends on the
specific truncation point used to find it.


5.3.4 Current in Terms of Green’s Functions 
We are now ready to use Eq. (5.89) derived earlier for the current. We need to make sure
that this equation leads to the Landauer-Büttiker expression obtained under the assumption
that the two leads were not interacting with each other. The evaluation of the associated
Green’s function becomes considerably simpler under this assumption.


We start by writing the Green’s functions for the leads (electrodes) in the Lehmann
representation. Because the applied voltage $V_\eta$ remains constant with time, we obtain

$$\begin{aligned}
G^<_{\eta\varsigma}(t_1 - t_2) &= \frac{i}{\hbar}f_{F\eta}(E_{\eta\varsigma})\exp\left[-\frac{i}{\hbar}\int_{t_2}^{t_1}(E_{\eta\varsigma} - q_eV_\eta)\,dt\right] \\
&= \frac{i}{\hbar}f_{F\eta}(E_{\eta\varsigma})\exp\left[-\frac{i}{\hbar}(E_{\eta\varsigma} - q_eV_\eta)(t_1 - t_2)\right].
\end{aligned} \tag{5.119}$$

This result can be used to write the retarded and advanced Green’s functions in the form

$$G^+_{\eta\varsigma}(t_1 - t_2) = -\frac{i}{\hbar}u(t_1 - t_2)\exp\left[-\frac{i}{\hbar}(E_{\eta\varsigma} - q_eV_\eta)(t_1 - t_2)\right], \tag{5.120}$$

$$G^-_{\eta\varsigma}(t_1 - t_2) = \frac{i}{\hbar}u(t_2 - t_1)\exp\left[-\frac{i}{\hbar}(E_{\eta\varsigma} - q_eV_\eta)(t_1 - t_2)\right]. \tag{5.121}$$

The Green’s function required for the current in Eq. (5.89) is a mixed-type Green’s
function defined as $G_{n,\eta\varsigma}(t - t') = -(i/\hbar)\langle\mathcal{T}\{\hat{q}_n(t)\,\hat{l}^\dagger_{\eta\varsigma}(t')\}\rangle$. Using the anticommutator properties
of fermion operators in the leads and the quantum device, we express the lesser component of
this Green’s function in terms of Green’s functions associated with the lead and the quantum device:

$$G^<_{n,\eta\varsigma}(t_1, t_2) = \sum_m \int \chi^*_{\eta\varsigma,m}\left[G^+_{nm}(t_1, t_3)\,G^<_{\eta\varsigma}(t_3, t_2) + G^<_{nm}(t_1, t_3)\,G^-_{\eta\varsigma}(t_3, t_2)\right] dt_3. \tag{5.122}$$

This expression contains time-domain convolutions. However, as convolutions turn to
products in the Fourier domain, we take advantage of this property and express the current
in the energy basis, as we did earlier. Using Eq. (5.89), we obtain

$$I_\eta = \frac{2q_e}{h}\int \mathrm{Re}\left\{\sum_{\varsigma,n,m}\chi_{\eta\varsigma,n}\,\chi^*_{\eta\varsigma,m}\left[G^+_{nm}(E)\,G^<_{\eta\varsigma}(E) + G^<_{nm}(E)\,G^-_{\eta\varsigma}(E)\right]\right\} dE. \tag{5.123}$$

We can simplify the preceding expression by introducing physically meaningful spectral
functions for the two electrodes in a way similar to the spectral function introduced in
Section 5.3.3. Using the definition

$$\{\Gamma_\eta(E)\}_{mn} = 2\pi\sum_{\varsigma}\delta(E - E_{\eta\varsigma})\,\chi_{\eta\varsigma,n}(E)\,\chi^*_{\eta\varsigma,m}(E), \tag{5.124}$$

where $\eta = 1$ for the left electrode and $\eta = 2$ for the right electrode, we can write the
current in the form

$$I_\eta = \frac{iq_e}{2\pi\hbar}\int \mathrm{Tr}\left\{\Gamma_\eta(E - q_eV_\eta)\left[G^<(E) + f_{F\eta}(E - q_eV_\eta)\left(G^+(E) - G^-(E)\right)\right]\right\} dE, \tag{5.125}$$

where the trace operation represents the double sum over $m$ and $n$ and $f_{F\eta}$ is the Fermi
function in Eq. (5.40). This equation contains three Green’s functions, $G^<(E)$, $G^+(E)$, and
$G^-(E)$, all of which correspond to the quantum device coupled to the two leads. It is
important to recognize that they often cannot be calculated without making a few reasonable
approximations (see Aside 5.4).

Equation (5.3.4) can be used to find current /; flowing through the left electrode using 
n = | and the current J) flowing through the right electrode using n = 2. Under ideal 
steady-state conditions, Kirchhoff’s current law requires Jy = —J,. Using this result, the 
total current can be written as J = (1) — Jy)/2 and is given by 


ide 
27th 


l= 


f Te{ [PE — 4eV1) — Pal — geVD)]G“(B) 


+ [fri(E — qe VIE — qe V1) — fro(E — geV2)T' 2(E — qe V2)] 
[G+Œ) - Gf 4E, (5.126) 


where we included spin degeneracy by multiplying Z with a factor of two. 

We need to introduce further approximations to write this result in the Landauer–Büttiker
form. As our focus is on the steady-state situation in the case of noninteracting leads, we
calculate the lesser Green’s function by using the Keldysh equation

$$G^{<}(E) = G^{+}(E)\,\Sigma^{<}(E)\,G^{-}(E), \tag{5.127}$$

where $\Sigma^{<}(E)$ is the lesser self-energy and can be calculated from the equilibrium relations
using

$$\Sigma^{<}(E) = i\big[f_{F1}(E - q_e V_1)\,\Gamma_1(E - q_e V_1) + f_{F2}(E - q_e V_2)\,\Gamma_2(E - q_e V_2)\big], \tag{5.128}$$

where $\Gamma_1(E)$ and $\Gamma_2(E)$ are the spectral functions defined earlier. We also note that

$$G^{+}(E) - G^{-}(E) = -i\,G^{+}(E)\big[\Gamma_1(E - q_e V_1) + \Gamma_2(E - q_e V_2)\big]G^{-}(E). \tag{5.129}$$


Substituting the preceding results in the expression for the current $I$, and simplifying,
we finally obtain the Landauer–Büttiker result:

$$I = \frac{q_e}{\pi\hbar} \int \mathrm{Tr}\big[\Gamma_1(E - q_e V_1)\,G^{+}(E)\,\Gamma_2(E - q_e V_2)\,G^{-}(E)\big] \times \big[f_{F1}(E - q_e V_1) - f_{F2}(E - q_e V_2)\big]\, dE. \tag{5.130}$$


An advantage of the NGF method is that it provides more insight than the single-particle 
scattering matrix method. The steady-state analysis carried out here with the NGF for- 
malism is known as the Keldysh formalism [308]. It has been used to investigate the 
steady-state quantum transport in many mesoscopic systems [324]. 
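
As a rough numerical illustration of Eq. (5.130), the following sketch evaluates the trace and the current for a small model device. Several things here are assumptions made only for the example and are not taken from the text: the spectral functions are energy-independent matrices (a wide-band approximation), the retarded Green’s function is taken as $G^{+}(E) = [E - H + i(\Gamma_1 + \Gamma_2)/2]^{-1}$, the bias enters only through the chemical potentials `mu1` and `mu2`, and the two-site Hamiltonian `H` with its numerical values is hypothetical.

```python
import numpy as np

Q_E = 1.602176634e-19      # elementary charge (C)
HBAR = 1.054571817e-34     # reduced Planck constant (J s)
K_B_EV = 8.617333262e-5    # Boltzmann constant (eV/K)


def fermi(E, mu, T):
    """Fermi function for energies E and chemical potential mu (eV), T in kelvin."""
    x = np.clip((np.asarray(E) - mu) / (K_B_EV * T), -700.0, 700.0)
    return 1.0 / (np.exp(x) + 1.0)


def retarded_green(E, H, gamma1, gamma2):
    """G^+(E) = [E - H + i(Gamma1 + Gamma2)/2]^(-1) (assumed wide-band form)."""
    identity = np.eye(H.shape[0])
    return np.linalg.inv(E * identity - H + 0.5j * (gamma1 + gamma2))


def transmission(E, H, gamma1, gamma2):
    """Tr[Gamma1 G^+ Gamma2 G^-], the trace appearing in Eq. (5.130)."""
    g_plus = retarded_green(E, H, gamma1, gamma2)
    g_minus = g_plus.conj().T          # advanced function is the Hermitian conjugate
    return float(np.trace(gamma1 @ g_plus @ gamma2 @ g_minus).real)


def landauer_current(H, gamma1, gamma2, mu1, mu2, T, energies):
    """Current in amperes from Eq. (5.130); energies form a uniform grid in eV."""
    t = np.array([transmission(E, H, gamma1, gamma2) for E in energies])
    window = fermi(energies, mu1, T) - fermi(energies, mu2, T)
    dE_joule = (energies[1] - energies[0]) * Q_E
    return Q_E / (np.pi * HBAR) * np.sum(t * window) * dE_joule


# Hypothetical two-site device: on-site energies 0 eV, hopping 0.1 eV.
H = np.array([[0.0, 0.1], [0.1, 0.0]])
gamma1 = np.diag([0.05, 0.0])   # left lead couples to site 1 (eV)
gamma2 = np.diag([0.0, 0.05])   # right lead couples to site 2 (eV)
energies = np.linspace(-1.0, 1.0, 4001)
current = landauer_current(H, gamma1, gamma2, 0.05, -0.05, 300.0, energies)
```

With this choice of $G^{+}$, the identity in Eq. (5.129) holds exactly, which gives a quick consistency check on any implementation before the current is computed.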

The NGF method has also been used to calculate time-dependent currents by considering
voltages varying with time [315, 325]. This work fueled the use of the NGF method
to describe charge transport while incorporating the interaction of electrons with photons,
phonons, and other electrons. The time dependence is incorporated into this formalism
by making the parameters appearing in the Hamiltonian functions of time. As a result,
$E_n \to E_n(t)$, $E_{\eta\varsigma} \to E_{\eta\varsigma} + \Delta E_{\eta\varsigma}(t)$, and $\chi_{\eta\varsigma,n}$ also becomes a function of time. With these
changes, the time-dependent current through each electrode becomes ($\eta = 1, 2$):


Lee 94 Tr| C (E, r, t) exp ie -9 x 
1 h ! h 
(G50, t) + fra (EG G, D) | dt dE, (5.131) 


where the spectral function $\Gamma_\eta$ is also time dependent:


if! 
D,E, t1, t2) = 27 È Xnen(ti) exp (-; f AEnç(T) ar) Xi m(2)5(E — Ens (0). 
ç 2 

(5.132) 


The Keldysh equation, $G^{<}(E) = G^{+}(E)\,\Sigma^{<}(E)\,G^{-}(E)$, in the energy domain becomes
a double convolution when converted to the time domain. As a result, the two Green’s
functions appearing in Eq. (5.131) are given by


Gt) = Ghat) + ff GF Ct. )EC1.12)G9 (tod dry (5.133) 


GHE, = Got.) + J / Go (t1, DET (ti, 2)Gh (h, r) dt dtr. (5.134) 
Here, the subscript “0” represents the unperturbed Green’s function of the system before
the time-dependent voltage was applied. The self-energies appearing in the preceding
equations are also functions of time and have the form


i i 
Eis) = i > [fer yE. t1, t2) exp Gua = m) dE, (5.135) 
n 


Ep) = -iun =) Y f Pymn(Esti.t2)exp ( -FE -1)) dE. (5.136) 
mn > h z il 9019 h 


These equations can even be used to calculate the nonlinear response of a quantum device 
controlled with external voltages. The reason is that the NGF method takes into account 
all correlations from the beginning of a time-dependent perturbation. In contrast, the equi- 
librium formalism does not consider correlations at all. This is the main reason why the 
Keldysh formalism is widely used for describing nonequilibrium operation of quantum 
devices. 


Quantum Tunneling 


“All right,” said the Cat; and this time it vanished quite slowly, beginning with the end 
of the tail, and ending with the grin, which remained some time after the rest of it had 
gone. 

Lewis Carroll, Alice’s Adventures in Wonderland 


6.1 Physics of Tunneling 
Consider a particle of energy E approaching a potential barrier, whose maximum height
Umax is such that it exceeds the particle’s energy (see Figure 6.1). According to classi- 
cal mechanics, the particle cannot penetrate the potential barrier, nor does it have enough 
energy to get over the barrier. As a result, in a classical world, the particle must get reflected 
at the barrier. Indeed, we observe such a behavior when billiard balls hit the boundaries of 
the table used to play the game. 


6.1.1 What Is Tunneling? 


A new possibility opens up in a quantum world, where quantum mechanics makes possible 
what is impossible classically. In quantum mechanics, the motion of a particle such as an 
electron is governed by the wave function Ψ, which satisfies the Schrödinger equation,
first encountered in Section 2.2. Even though the amplitude of Ψ must decay exponentially
inside the classically forbidden energy region where E < Umax, as seen in Figure 6.1, it 
may still have a finite value on the other side of the potential barrier, if the barrier is not 
too wide. This indicates a finite probability of the particle to be found on the opposite side 
of the barrier and gives the appearance that the particle has penetrated the potential barrier. 
Such a phenomenon is referred to as quantum tunneling, or simply tunneling [326]. 
Tunneling is a genuine quantum effect with no counterpart in classical physics. Owing 
to its nonintuitive nature, this phenomenon remains an enigma to this very day. Tunneling 
is a direct consequence of the wave–particle duality, a core concept in quantum mechanics
[40, 96]. Its existence leads to questions such as what path a tunneling particle takes 
through a potential barrier, or how much time it takes to tunnel through this barrier. There 
exists a vast amount of literature on the tunneling time and its dependence on the shape and 
other features of the potential barrier, leading to many controversies and paradoxes. The
source of confusion lies in the observation that no general method exists for constructing
a self-adjoint time operator that is canonically conjugate to the Hamiltonian [327].

Figure 6.1: Schematic illustration of tunneling across a potential barrier whose height Umax
exceeds the particle’s energy E. The behavior of the particle’s wave function is also shown
qualitatively, both inside and outside the barrier.

As an example of a controversy, some measurements of the tunneling time seem to imply
superluminal speeds of the particle inside the potential barrier, in apparent violation of
the fundamental speed limit imposed by the theory of relativity. A well-known paradox
is the Hartman effect discovered in 1962 [328]. It states that the tunneling time through
a relatively thick barrier is independent of the actual thickness of the barrier. This statement
implies that the tunneling speed has no upper limit, which also appears to agree with
several experiments. Indeed, a 2019 experiment suggests that tunneling across a barrier
occurs instantaneously [329]. Regardless of these paradoxes, tunneling is a real quantum
phenomenon with a wide range of applications in areas such as scanning tunneling
microscopy [330], nanoelectronics [331], and attosecond nanophysics [332]. Examples
from everyday life include flash-memory cards exploiting Fowler–Nordheim tunneling
and a thermonuclear process inside the sun that fuses high-speed protons into helium
nuclei.

One can identify three distinct scenarios based on the boundary conditions imposed on 
the tunneling process resulting from the specific shape of a potential barrier. These three 
generic barrier shapes are depicted in Figure 6.2. Part (a) shows a double-well potential 
profile with a middle hump. In this case, the particle is confined to the regions bounded at 
both ends, but tunneling may still occur through the middle hump. Such potential barriers 
occur inside superconducting quantum interference devices (SQUIDs), which are used for
quantum computing [333]. Part (b) of Figure 6.2 shows another common potential profile
where a particle decays from a metastable state via tunneling. An example of this type of
tunneling is provided by radioactive decay of polonium-212, leading to the emission of
alpha particles [334]. The potential barrier in this case is formed by the combined action
of the attractive nuclear force inside the nucleus and the repulsive electromagnetic
(Coulomb) force. Part (c) of Figure 6.2 shows the conventional tunneling case where a particle is
not confined on either side of the barrier. Tunnel junctions provide a good example of such 
a potential profile [335]. The three basic potential profiles serve as building blocks of more 
esoteric potential profiles for which tunneling can take place. For example, we study later 
a resonant tunneling device, where tunneling via two conventional barriers is exploited to 
enhance the tunneling process. In this section, we focus for simplicity on potential barriers
that are one-dimensional. We should stress that the results obtained in this case may not
always apply to higher-dimensional tunneling owing to the appearance of peculiarities that
have no one-dimensional counterparts.

Figure 6.2: Schematic illustration of three generic shapes of one-dimensional potential
barriers that serve as building blocks of more complex potential profiles.


6.1.2 Gamow’s Theory of Tunneling 


The tunneling concept was first invoked by Gamow in 1928 to explain the emission of 
alpha particles during radioactive decay of certain heavy nuclei [334]. Alpha particles are 
helium atoms whose two electrons have been removed, resulting in ions with a net positive 
charge of $2q_e$. These helium ions have two protons and two neutrons, tightly bound together
by the strong nuclear force. The radioactive nuclei of heavy atoms (such as polonium)
were observed to emit alpha particles (helium ions) at random times through a process
known as alpha decay. This emission was explained as tunneling of alpha particles
(with energies of 4 to 9 MeV) trapped inside a potential barrier (height about 30 MeV) formed
by the combination of an attractive nuclear force and a repulsive Coulomb force resulting
from the remaining charge of $(Z_p - 2)q_e$, where $Z_p$ is the atomic number of the heavy
nucleus. Classically, an alpha particle of energy E < Umax cannot escape the potential 
barrier. The tunneling theory of Gamow describes how an alpha particle may occasionally 
tunnel through the barrier because of its oscillating wave function. His theory not only 
pointed to the underlying mechanism for this type of radioactive decay but also explained 
large variations in the mean lifetimes of various radioactive nuclei. 

Figure 6.3 shows the potential experienced by an alpha particle while it is confined inside
the nucleus of a heavy atom such as polonium-212 (Z = 84). Gamow modeled the barrier
as a spherical potential well of radius $r_p$, representing the size of the nucleus. Outside the
nucleus ($r > r_p$), alpha particles experience Coulomb repulsion arising from the remaining
charge $Z_p - 2$ inside the nucleus. This potential can be written as

$$U(r) = \frac{(2q_e)(Z_p - 2)q_e}{4\pi\varepsilon_0 r}, \qquad r > r_p. \tag{6.1}$$

It can be used to find the distance $r_f$ (see Figure 6.3) at which an alpha particle escapes
the Coulomb barrier. Using $U(r_f) = E_i$, where $E_i$ is the initial energy of the alpha particle
inside the nucleus, we find

$$r_f = \frac{2(Z_p - 2)q_e^2}{4\pi\varepsilon_0 E_i}. \tag{6.2}$$

Typical values of $E_i$ are estimated to be 4 to 9 MeV, whereas the barrier height at $r = r_p$ is
about 30 MeV.

Figure 6.3: Emission of an alpha particle from a radioactive nucleus through tunneling. The
schematic shows the nuclear energy well where the alpha particle resides and the Coulomb
barrier that it must tunnel through.

Experiments indicate that the emission rate of alpha particles is proportional to the
number $N_R$ of radioactive atoms present at that time. Mathematically, we can write this
relation as

$$\frac{dN_R}{dt} = -K_R N_R, \tag{6.3}$$

where $K_R$ is called the decay constant. It is related to the transmission coefficient across
the potential barrier, which depends on the shape of the barrier and can be calculated with
the WKB method discussed in Section 6.2.1. This method shows that $K_R$ is of the form
$K_R = C_R \exp(-2\gamma)$, where $C_R = \hbar/(2m_\alpha r_p^2)$ for an alpha particle with mass $m_\alpha$, and $\gamma$ is
given by [see Eq. (6.36)]

$$\gamma = \frac{1}{\hbar}\int_{r_p}^{r_f} \sqrt{2m_\alpha[U(r) - E_i]}\,dr = \frac{\sqrt{2m_\alpha E_i}}{\hbar}\int_{r_p}^{r_f}\sqrt{\frac{r_f}{r} - 1}\,dr, \tag{6.4}$$


where we used $U(r)$ given in Eq. (6.1). The integration over $r$ can be done with the variable
change $r = r_f\cos^2\theta$ and leads to the following result:

$$\gamma = \frac{r_f}{\hbar}\sqrt{2m_\alpha E_i}\,\Big[\cos^{-1}\!\sqrt{\eta} - \sqrt{\eta - \eta^2}\Big], \tag{6.5}$$

where $\eta = r_p/r_f$ is a dimensionless ratio of two distances. For heavy radioactive atoms
($Z_p \gg 1$), this ratio is a relatively small number proportional to $1/Z_p$. If we assume $\eta \approx 0$
and use $\cos^{-1}(0) = \pi/2$, we obtain the following result for $K_R = C_R\exp(-2\gamma)$:

$$K_R = \frac{\hbar}{2m_\alpha r_p^2}\exp\left[-\frac{2\pi}{\hbar}\sqrt{\frac{2m_\alpha}{E_i}}\,\frac{(Z_p - 2)q_e^2}{4\pi\varepsilon_0}\right]. \tag{6.6}$$


As seen in Table 6.1, this expression provides values of the decay constant that are quite 
close to those measured experimentally. 
Nucleus    K_R (experimental)    K_R (theory)
¹⁴⁸Gd      2.2 × 10⁻¹⁰           2.2 × 10⁻¹⁰
²¹⁴Po      4.23 × 10³            4.9 × 10³
²³⁰Th      2.09 × 10⁻¹³          1.7 × 10⁻¹³
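
The closed-form Gamow exponent in Eq. (6.5) can be checked against a direct numerical evaluation of the integral in Eq. (6.4). The sketch below does this and then evaluates $K_R = C_R\exp(-2\gamma)$. The nuclear radius `R_P` (9 fm) and the energy `E_I` (8 MeV) are illustrative assumptions chosen within the ranges quoted in the text; they are not fitted values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant (J s)
Q_E = 1.602176634e-19        # elementary charge (C)
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
M_ALPHA = 6.6446573357e-27   # alpha-particle mass (kg)
MEV = 1.0e6 * Q_E            # one MeV in joules


def escape_radius(Z_p, E_i):
    """Distance r_f from Eq. (6.2), where U(r_f) = E_i."""
    return 2.0 * (Z_p - 2) * Q_E**2 / (4.0 * math.pi * EPS0 * E_i)


def gamow_exponent(Z_p, E_i, r_p):
    """Closed-form gamma from Eq. (6.5)."""
    r_f = escape_radius(Z_p, E_i)
    eta = r_p / r_f
    bracket = math.acos(math.sqrt(eta)) - math.sqrt(eta - eta**2)
    return (r_f / HBAR) * math.sqrt(2.0 * M_ALPHA * E_i) * bracket


def gamow_exponent_numeric(Z_p, E_i, r_p, steps=200_000):
    """Midpoint-rule evaluation of the integral in Eq. (6.4)."""
    r_f = escape_radius(Z_p, E_i)
    dr = (r_f - r_p) / steps
    total = 0.0
    for k in range(steps):
        r = r_p + (k + 0.5) * dr
        total += math.sqrt(r_f / r - 1.0) * dr
    return math.sqrt(2.0 * M_ALPHA * E_i) / HBAR * total


def decay_constant(Z_p, E_i, r_p):
    """K_R = C_R exp(-2 gamma) with C_R = hbar / (2 m_alpha r_p^2)."""
    c_r = HBAR / (2.0 * M_ALPHA * r_p**2)
    return c_r * math.exp(-2.0 * gamow_exponent(Z_p, E_i, r_p))


Z_P = 84            # polonium
R_P = 9.0e-15       # assumed nuclear radius (m)
E_I = 8.0 * MEV     # assumed alpha-particle energy (J)
gamma_closed = gamow_exponent(Z_P, E_I, R_P)
gamma_numeric = gamow_exponent_numeric(Z_P, E_I, R_P)
```

Because $\gamma$ sits in an exponent, a modest change in $E_i$ shifts $K_R$ by many orders of magnitude, which is the mechanism behind the wide spread of lifetimes noted above.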


6.1.3 Applications of Tunneling 


In a series of three papers, Hund used the concept of tunneling to describe the observed 
features of molecular spectra [337, 338, 339]. He analyzed the oscillations of electrons 
between two atomic bound states, modeled as a double potential well. A necessary assumption
in his analysis was the separation of the motion of electrons from vibrations and
rotations of atoms, an approximation later made quantitative by Born and Oppenheimer.
Hund studied the dynamics of a bound pair of atoms and noted the presence of reflection- 
symmetric potentials with classically impenetrable potential barriers. He found the two 
stationary states, even and odd combinations of the atomic states, for a molecule made 
with two atoms, which can be identical or distinct atoms. Hund noted that the superposi- 
tion of these particular stationary states turned the system into a nonstationary oscillatory 
state, resulting from tunneling between the associated atomic quantum wells. He used the 
distance between the atoms to determine the width of the potential barrier and derived 
an expression for the beat period. The success of this theory cemented the validity of the 
tunneling concept as a genuine physical effect. 

Another area where the tunneling concept improved our understanding is related to the 
emission of electrons from a metal surface when a strong electrostatic field is applied to 
accelerate electrons toward the metal’s surface. This type of emission is known as
Fowler–Nordheim tunneling, after the two scientists who showed that the effect is purely quantum
mechanical and can be explained by considering tunneling of electrons through the poten- 
tial barrier at the metal’s surface [340]. Their simplified model described the effect as a 
simple one-dimensional problem where the electrons tunnel through a thin barrier sub- 
jected to a uniform electrostatic field perpendicular to the surface of the metal. The metal 
was modeled as an ideal Fermi gas, a method pioneered by Sommerfeld, and the thin 
barrier was modeled as a rectangular potential barrier; such a barrier is now used in text- 
books for understanding the tunneling phenomenon [46]. The resulting solution showed 
that the tunneling probability has exponential dependence on the tunneling distance, which 
depends inversely on the applied electrostatic field. This type of field emission has found 
practical applications in solid-state electronic components such as tunneling diodes [341]. 

The operation of Schottky junctions, used for rectifying the flow of current, is also based 
on tunneling. Until the end of World War II, many attempts were made to relate the current 
flow in a metal–semiconductor junction (a Schottky junction) to the tunneling of electrons
in solids. But the models were not realistic enough, and theory predicted current flow in the 
direction opposite to the observed one [341]. Unlike the field emission effect, where current
flow is a result of the tunneling of electrons through a thin surface barrier, the electrostatic
field lowers the surface barrier height in a Schottky junction, enabling electrons to get over 
the barrier. 

The invention of the transistor in 1947 rekindled the interest in tunneling of electrons in 
rectifying junctions made with doped semiconductors. A significant advance occurred in 
1958 when Esaki made tunnel diodes that were capable of oscillating at frequencies as high 
as 100 GHz [342] (see Aside 6.1). Such a tunnel diode exhibits a negative differential resis- 
tance in its voltage-current (V-I) characteristics over a wide range of frequencies. It consists 
of a heavily doped p-n junction (only about 10 nm wide), inside which interband tunneling 
takes electrons from the valence band to the conduction band. This behavior showed con- 
clusively that electrons can tunnel between two metals, a topic pursued by many scientists 
for decades. Owing to a relatively high parasitic junction capacitance, tunnel diodes are 
not much used in modern devices. 


Aside 6.1 Tunnel Junctions 

A tunnel junction is essentially a capacitor with a sufficiently thin dielectric layer between 
two metallic plates through which an electron can tunnel from one plate to the other. Even 
though several methods exist for estimating the tunneling time, recent experiments indi- 
cate that tunneling is almost instantaneous [329]. Figure 6.4 shows a tunneling junction 
schematically. Two metal plates (or electrodes) of cross-sectional area A are separated by 
a short gap d (width of the junction). 


When a voltage $V_{12}$ is applied across the tunnel junction, the Fermi levels $\mu_1$ and $\mu_2$ of
the electrodes become separated by $q_e V_{12}$ such that $\mu_2 = \mu_1 - q_e V_{12}$. If $U_{TJ}$ is the height
of the potential barrier between the two electrodes, the average transmission probability
$T_{TJ}$ for an electron of mass $m_e$ can be obtained from Eq. (6.36). Assuming $U_{TJ}$ is nearly
constant and much larger than $E$, the integration can be easily done to obtain

$$T_{TJ} = \exp\left(-\frac{2d}{\hbar}\sqrt{2m_e U_{TJ}}\right). \tag{6.7}$$
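
To get a feel for the exponential sensitivity in Eq. (6.7), the following sketch evaluates the transmission probability for a few gap widths. The 1 eV barrier height and the nanometer-scale gaps are illustrative assumptions, and `tunnel_transmission` is a hypothetical helper name.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # electron mass (kg)
Q_E = 1.602176634e-19     # elementary charge (C); also 1 eV in joules


def tunnel_transmission(d, u_tj):
    """Eq. (6.7): exp(-(2d/hbar) sqrt(2 m_e U_TJ)), with d in meters and u_tj in joules."""
    return math.exp(-2.0 * d / HBAR * math.sqrt(2.0 * M_E * u_tj))


U_TJ = 1.0 * Q_E                                  # assumed 1 eV barrier
KAPPA = math.sqrt(2.0 * M_E * U_TJ) / HBAR        # decay constant inside the barrier (1/m)
for gap_nm in (0.5, 1.0, 1.5, 2.0):
    print(f"d = {gap_nm:.1f} nm  ->  T_TJ = {tunnel_transmission(gap_nm * 1e-9, U_TJ):.3e}")
```

Each additional length $\Delta d$ of barrier multiplies the transmission by $\exp(-2\kappa\,\Delta d)$, where $\kappa = \sqrt{2m_e U_{TJ}}/\hbar$, so the junction resistance is extremely sensitive to the gap width.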


Physically speaking, $U_{TJ}$ is the difference between the work functions of the metal used
for the electrodes and the insulator used for the barrier. The occupation probability of the
energy states in electrodes 1 and 2 is governed by the distribution functions $f_1(E)$ and
$f_2(E)$, respectively. Using them, the current flowing between the two electrodes is given by
(see Chapter 5)

$$I_{12} \propto -q_e A\,T_{TJ}\int_{-\infty}^{\infty} D_{\mathrm{os}1}(E)\,D_{\mathrm{os}2}(E)\,\big\{f_1(E)[1 - f_2(E)] - f_2(E)[1 - f_1(E)]\big\}\,dE, \tag{6.8}$$

where the product $f_1(E)[1 - f_2(E)]$ indicates that an electron in a filled energy state of
electrode 1 can only tunnel to electrode 2 if there is a vacancy in the corresponding energy
level to receive it. The term $f_2(E)[1 - f_1(E)]$ corresponds to tunneling of electrons from
electrode 2 to electrode 1. These probabilities are multiplied by their respective density of
states and then integrated over all available energy states. The net current is related to the
difference of the tunneling rates of electrons in the opposite directions.

Figure 6.4: Schematic showing operation of a tunnel junction. Two metal plates (electrodes)
are separated by a short gap (width of the junction). Their filled energy levels are shown by
a set of dense lines, while two dashed lines indicate their Fermi levels.


If we assume that the two electrodes are in thermal equilibrium, then the distribution functions
are just the Fermi–Dirac distributions (i.e., $f_j(E) \to f_{Fj}(E)$ for $j = 1, 2$). The current
is then given by

$$I_{12} \propto -q_e A\,T_{TJ}\int_{-\infty}^{\infty} D_{\mathrm{os}1}(E)\,D_{\mathrm{os}2}(E)\,\big[f_{F1}(E) - f_{F2}(E)\big]\,dE. \tag{6.9}$$

The integration over energy can be carried out if we replace the density of states for each
electrode with its value at the Fermi energy $E_F$, which is a valid assumption, as we have
seen in Chapters 3 and 5. The result is then given by

$$I_{12} \propto -q_e A\,T_{TJ}\,D_{\mathrm{os}1}(E_F)\,D_{\mathrm{os}2}(E_F)\,(\mu_1 - \mu_2) = \frac{V_{12}}{R_{TJ}}, \tag{6.10}$$

where we used $(\mu_1 - \mu_2) = q_e V_{12}$ and defined the tunnel-junction resistance $R_{TJ}$ such that
Ohm’s law can be used to describe the current flow through the tunnel junction. Our result
provides an expression for $R_{TJ}$ in terms of the physically relevant parameters of the tunnel
junction. Note that it does not depend on the operating temperature of the device.
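
The temperature independence noted above rests on the fact that the energy integral of the difference of two Fermi functions equals $\mu_1 - \mu_2$ at any temperature. A minimal numerical check is sketched below; the chemical potentials, the grid, and the helper names are assumptions made for illustration.

```python
import numpy as np

K_B_EV = 8.617333262e-5   # Boltzmann constant (eV/K)


def fermi(E, mu, T):
    """Fermi-Dirac distribution with E and mu in eV and T in kelvin."""
    x = np.clip((E - mu) / (K_B_EV * T), -700.0, 700.0)
    return 1.0 / (np.exp(x) + 1.0)


def fermi_window_integral(mu1, mu2, T, margin=2.0, n=400_001):
    """Integral over E of [f_F1(E) - f_F2(E)] on a uniform grid (result in eV)."""
    E = np.linspace(min(mu1, mu2) - margin, max(mu1, mu2) + margin, n)
    dE = E[1] - E[0]
    return float(np.sum(fermi(E, mu1, T) - fermi(E, mu2, T)) * dE)


MU1, MU2 = 0.01, 0.0      # assumed chemical potentials (eV), i.e. q_e V_12 = 10 meV
for T in (10.0, 77.0, 300.0, 600.0):
    print(f"T = {T:5.0f} K  ->  integral = {fermi_window_integral(MU1, MU2, T):.6f} eV")
```

The printed values stay at $\mu_1 - \mu_2$ as the temperature changes: thermal smearing broadens the edges of the energy window without changing its area, provided the density of states is flat over that window.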


If a tunnel junction is biased with a constant current $I_{12}$, it will sustain this current flow
through oscillations at the frequency $f_T = |I_{12}|/q_e$ because electrons need to tunnel one
by one through the junction’s barrier. Owing to its capacitive character, a tunnel junction
continues to accumulate charges until the time it becomes favorable for an electron
to tunnel through the barrier. Further details on single-charge tunneling can be found in
Refs. [343, 344]. We will see later in the context of Josephson junctions that the oscillatory
behavior still occurs but only at an oscillation frequency of $f_T/2$ (called the Bloch
frequency). The 50% reduction in the oscillation frequency is related to the tunneling of
Cooper pairs, which have twice the charge of a single electron.


In a 1960 experiment, it was observed that the V-I characteristic of a tunnel diode 
changed from a straight line to a curve when one of the two metal electrodes was made 
of a superconducting material [345]. This was found to be related to the feature that all 
superconductors exhibit an energy gap Eg centered at the Fermi level. As a result, no cur- 
rent can flow until the applied voltage reaches a value V = Eg/(2qe). This feature made it 
possible to measure the magnitude of Eg for superconductors with sufficient accuracy. The 
energy gap plays an important role in the theory of superconductivity based on the pairing 
of two electrons; such pairs are called Cooper pairs. The theoretical work of Josephson in 
1962 predicted that a supercurrent resulting from the movement of Cooper pairs can flow 
across a thin layer of insulating oxide, serving as a barrier between two superconductors. 
The supercurrent is in addition to the current found by Giaever [345] and results from tun- 
neling of electrons in pairs [346] (see Aside 6.2). This effect is known as the Josephson 
effect and occurs in two different forms. In the case of the DC Josephson effect, a constant 
current flows across the junction even without applying any electric or magnetic fields (no 
bias voltage across the tunnel junction). In the case of the AC Josephson effect, a constant 
voltage is applied across the junction, and the resulting current oscillates at frequencies in 
the range 10 to 100 MHz. 


Aside 6.2 Josephson Junction 


According to the theory of Bardeen, Cooper, and Schrieffer (BCS), superconductivity 
results from the movement of Cooper pairs [347, 348], which form through pairing of 
two electrons with the same charge but opposing spins and momenta. Unlike electrons, 
which are fermions, Cooper pairs have a net spin of zero and thus act as bosons. As we 
have seen in Section 2.4, while only two fermions can occupy a quantum state, an infinite 
number of bosons could be in the same quantum state. As a result, all Cooper pairs inside a 
superconductor can be described by a single wave function and have the same phase. This 
phase coherence of Cooper pairs plays a critical role in Josephson junctions. 


As shown schematically in Figure 6.5, a Josephson junction is made with two supercon- 
ductors with a thin layer between them. This layer is typically an insulator but can even 
be made of a metal or a semiconductor, as its role is only to provide a potential barrier 
for the Cooper pairs inside two superconductors. If the barrier is sufficiently thin, Cooper 
pairs can tunnel through it, and a current would flow through the Josephson junction. The 
tunneling can be understood as follows. The wave function $\Psi_1$ of the superconductor 1
does not vanish at the barrier but decays exponentially inside the barrier. If its magnitude
remains finite at the other end of the barrier, it can couple with the wave function $\Psi_2$ of
the superconductor 2, as seen in Figure 6.5. We denote this quantum-mechanical coupling
of the two superconductor wave functions by $\kappa_{JS}$. It can be viewed as a mechanism that
maintains the phase coherence of Cooper pairs over the entire Josephson junction. This
long-range phase coherence is fundamental to the Josephson effect.

Figure 6.5: Schematic of a Josephson junction. Two superconductors are separated by a thin
insulator providing a potential barrier for tunneling of Cooper pairs. The wave functions
$\Psi_1$ and $\Psi_2$ of Cooper pairs become coupled weakly through the barrier.
Mathematically, the Josephson effect is governed by the following two equations known
as Josephson equations [349]:

$$\frac{d\phi(t)}{dt} = \frac{2q_e}{\hbar}V_{12}(t), \qquad I_{1\to 2}(t) = I_{cc}\sin\phi(t), \tag{6.11}$$

where $V_{12}(t)$ is the applied voltage, $I_{1\to 2}(t)$ is the current flowing across the Josephson
junction, and $\phi(t)$ is the phase difference between the wave functions in the two
superconductors. The quantity $I_{cc}$ is known as the critical current and is proportional to the
coupling constant $\kappa_{JS}$. The critical current is an important phenomenological parameter of
the device; it can be controlled by varying temperature or by applying a magnetic field.
The physical constant 2qe/h is known as the Josephson constant, and its inverse, h/(2qe), 
is called the magnetic flux quantum. It is possible to combine the two Josephson equations 
and obtain the following expression for the tunneling current across the device: 
$$I_{1\to 2}(t) = I_{cc}\sin\left(\frac{2q_e}{\hbar}\int_0^t V_{12}(t')\,dt' + \phi_0\right), \tag{6.12}$$

where $\phi_0$ is an integration constant that depends on the initial setup of the junction. This
result shows that a Josephson junction can be operated in the following two regimes:


• DC Josephson effect: Even in the absence of any voltage ($V_{12} = 0$), a current flows
  through the Josephson junction because of tunneling of Cooper pairs through the thin
  barrier. This current is given by

  $$I_{1\to 2}(t) = I_{cc}\sin(\phi_0). \tag{6.13}$$

  This result shows that the magnitude and direction of the current are set by the initial
  value of the phase difference $\phi_0$ between the wave functions in two superconductors. If
  $\phi_0 \neq 0$, a finite constant current flows through a Josephson junction even without any
  applied voltage. Depending on the initial phase value, this current can take any value
  between $-I_{cc}$ and $I_{cc}$.

• AC Josephson effect: If a constant voltage $V_0$ is applied across a Josephson junction,
  the Cooper pairs tunneling through it generate an oscillating current such that

  $$I_{1\to 2}(t) = I_{cc}\sin\left(\frac{2q_e}{\hbar}V_0 t + \phi_0\right). \tag{6.14}$$

  Being a sinusoidal current without a DC term, the current oscillates between $-I_{cc}$ and
  $I_{cc}$ such that its time average vanishes. The oscillation frequency of this AC current is
  given by $\omega = (2q_e/\hbar)V_0$, indicating that $\omega$ is proportional to $V_0$ and can be controlled
  through the applied voltage. For this reason, the AC Josephson effect can be used as a
  voltage-to-frequency converter.
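
The two regimes can be reproduced by integrating the phase equation in Eq. (6.11) numerically and inserting the result into the current relation. The sketch below is a minimal illustration under assumed values: the critical current, the initial phase, and the 1 µV bias are hypothetical, and the function names are not from the text.

```python
import math

H_PLANCK = 6.62607015e-34            # Planck constant (J s)
HBAR = H_PLANCK / (2.0 * math.pi)    # reduced Planck constant (J s)
Q_E = 1.602176634e-19                # elementary charge (C)


def josephson_frequency(v0):
    """Oscillation frequency f = omega / (2 pi) = (2 q_e / h) V_0 in hertz."""
    return 2.0 * Q_E * v0 / H_PLANCK


def josephson_current(voltage, t_end, i_cc, phi0, steps=100_000):
    """Integrate d(phi)/dt = (2 q_e / hbar) V_12(t) by the midpoint rule; return I(t_end)."""
    dt = t_end / steps
    phi = phi0
    for k in range(steps):
        phi += (2.0 * Q_E / HBAR) * voltage((k + 0.5) * dt) * dt
    return i_cc * math.sin(phi)


I_CC = 1.0e-6    # assumed critical current (A)
PHI0 = 0.3       # assumed initial phase difference (rad)
V0 = 1.0e-6      # assumed constant bias (V)

i_dc = josephson_current(lambda t: 0.0, 1.0e-9, I_CC, PHI0)           # DC regime, Eq. (6.13)
quarter_period = 0.25 / josephson_frequency(V0)
i_ac = josephson_current(lambda t: V0, quarter_period, I_CC, PHI0)    # AC regime, Eq. (6.14)
```

For a bias of one microvolt the relation $f = (2q_e/h)V_0$ gives roughly 483.6 MHz, which illustrates why the junction acts as a voltage-to-frequency converter with a conversion factor fixed by fundamental constants.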


The Josephson effect is affected considerably by the presence of a magnetic field. As a 
result, a Josephson junction can be transformed into a Giaever tunneling junction, also 
called a superconducting tunneling junction, by applying a small magnetic field [350]. 
Such tunneling junctions have found applications as sensitive detectors of electromagnetic 
radiation, capable of operating in a wide frequency range from the infrared to the X-ray 
region. Such detectors exploit the ability of a superconducting tunneling junction to detect 
photons with an energy approximately equal to twice the value of the gap parameter of the 
material of the junction. Thus, arrays of superconducting tunneling junctions can be used to 
construct highly accurate spectrometers. Another application employs the high sensitivity 
of such a junction to magnetic fields to measure extremely weak magnetic fields [351] 
through a device known as a superconducting quantum interference device (SQUID). 


Current advances in this field are such that one can monitor the motion of individual
hydrogen atoms on a metal surface by using a scanning tunneling microscope. Classical
arguments suggest that the motion of atoms would be inhibited at low temperatures
because thermal diffusion is suppressed. However, it was found experimentally
that atoms remain mobile down to temperatures as low as 9 K [352]. Moreover, the
tunneling rate of atoms through the metal’s surface was found to increase as temperature
was lowered! Quantum tunneling continues to lead to new advances as we synthesize new
materials and invent novel quantum devices.


6.2 Quantum Description of Tunneling 


Tunneling through a rectangular potential barrier is a common textbook problem [96, 341].
We begin this section by considering a more general scenario shown in Figure 6.6, where
the shape of the barrier is not rectangular but varies in a continuous fashion in the region
$z_1 < z < z_2$. This region is classically forbidden for a particle whose energy $E$ is less
than $V(z)$ over the entire region. We first discuss the Wentzel–Kramers–Brillouin (WKB)
method used extensively in quantum mechanics and then use it to solve the tunneling
problem.

Figure 6.6: Schematic showing a nonuniform potential barrier $V(z)$ for a particle of energy $E$.
The classically forbidden region is confined to $z_1 < z < z_2$.


6.2.1 Wentzel–Kramers–Brillouin (WKB) Method


The WKB method is useful for finding the wave function of quantum particles whose 
potential energy varies slowly relative to other length scales [353, 354]. More precisely, the 
potential energy remains almost constant over the de Broglie wavelength of the particle. In 
this situation, both the amplitude and the phase of the wave function vary slowly over this 
length scale. The WKB method exploits this feature to find an approximate solution for 
the wave function. In quantum mechanics, the WKB method is often used for calculating 
energies of the bound states as well as tunneling rates through a potential barrier. For 
simplicity, we focus on the one-dimensional case here. The method can be extended to 
multiple dimensions, but the resulting equations are not always analytically solvable. 
Consider the motion of a particle of mass $m$ subjected to a slowly varying potential $V(z)$.
As discussed in Section 2.2, the energy eigenstates of this particle are found by solving the
Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2\Psi}{dz^2} + V(z)\,\Psi(z) = E\,\Psi(z), \tag{6.15}$$

where $E$ is the energy of a stationary eigenstate. This equation can be written in the form

$$\frac{d^2\Psi}{dz^2} + \frac{p^2(z)}{\hbar^2}\,\Psi(z) = 0, \tag{6.16}$$

where $p(z)$ is defined as

$$p(z) = \begin{cases} \sqrt{2m[E - V(z)]}, & \text{if } E > V(z) \\ i\sqrt{2m[V(z) - E]}, & \text{if } V(z) > E. \end{cases} \tag{6.17}$$

Clearly $p(z)$ can be interpreted as the classical momentum of the particle when its value
is real.
The WKB method solves Eq. (6.16) with the ansatz,

$$\Psi(z) = A(z)\exp\left(\frac{i}{\hbar}S(z)\right), \tag{6.18}$$

where both the amplitude $A(z)$ and phase $S(z)$ are real functions. Substituting this ansatz in
Eq. (6.16) and equating the real and imaginary parts, we obtain two equations:

$$\frac{d^2A}{dz^2} - \frac{A}{\hbar^2}\left[\left(\frac{dS}{dz}\right)^2 - p^2(z)\right] = 0, \tag{6.19}$$

$$2\,\frac{dA}{dz}\frac{dS}{dz} + A\,\frac{d^2S}{dz^2} = 0. \tag{6.20}$$

These equations can be solved approximately if $A$ varies slowly enough that its second
derivative in Eq. (6.19) can be neglected. This equation then leads to

$$\frac{dS(z)}{dz} = \pm p(z) \;\Rightarrow\; S(z) = \pm\int_{z_i}^{z} p(z')\,dz', \tag{6.21}$$

where $z_i$ is the initial position of the particle. Using this result in the second equation,
the amplitude is found to be given by

$$\frac{d}{dz}\left(A^2 p\right) = 0 \;\Rightarrow\; A(z) = \frac{C}{\sqrt{p(z)}}, \tag{6.22}$$

where $C$ is a constant that needs to be determined. Since the phase equation has two
solutions, the general solution of Eq. (6.16) is a linear combination of these two solutions.

As an example, consider the solution in the classically allowed region where E > V(z).
In this case p(z) is real and positive and the general solution has the form

$$\Psi(z) = \frac{1}{\sqrt{p(z)}}\left[C_+\exp\left(\frac{i}{\hbar}\int_{z_i}^{z} p(z)\,dz\right) + C_-\exp\left(-\frac{i}{\hbar}\int_{z_i}^{z} p(z)\,dz\right)\right]. \tag{6.23}$$


In the classically forbidden region, V(z) > E, Eq. (6.17) shows us that p(z) is purely 
imaginary. Using p(z) = i|p(z)|, the general solution takes the form 


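$$\Psi(z) = \frac{1}{\sqrt{|p(z)|}}\left[C_+\exp\left(-\frac{1}{\hbar}\int_{z_i}^{z} |p(z)|\,dz\right) + C_-\exp\left(\frac{1}{\hbar}\int_{z_i}^{z} |p(z)|\,dz\right)\right]. \tag{6.24}$$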


However, there is another situation not covered by the above two scenarios; it corresponds
to E ≈ V(z) or p(z) ≈ 0. As p(z) appears in the denominator of the preceding
solutions for $\Psi(z)$, we have a singular situation near p(z) = 0. The z values where p(z)
vanishes are known as the turning points because a classical particle stops and reverses 
its direction of motion at those points. In quantum mechanics, the behavior of the wave 
function changes from being oscillatory to decaying exponential at a turning point. To gain 
further insight, Figure 6.7 shows the potential V(z) near a turning point, assumed to be 
located at z = 0. In the region on the left of the turning point (z < 0), the solution is a lin- 
ear combination of the forward and backward propagating waves, whereas it must decay 
exponentially in the region on the right of the turning point (z > 0). Thus, the general
solution can be written as

$$\Psi(z) = \begin{cases}
\dfrac{1}{\sqrt{p(z)}}\left[C_{w+}\exp\left(\dfrac{i}{\hbar}\displaystyle\int_z^0 p(z)\,dz\right) + C_{w-}\exp\left(-\dfrac{i}{\hbar}\displaystyle\int_z^0 p(z)\,dz\right)\right], & \text{if } z < 0 \\[2ex]
\dfrac{C_w}{\sqrt{|p(z)|}}\exp\left(-\dfrac{1}{\hbar}\displaystyle\int_0^z |p(z)|\,dz\right), & \text{if } z > 0.
\end{cases} \tag{6.25}$$

Figure 6.7: Schematic showing the potential V(z) near a turning point located at z = 0. At the turning point, the
wave function changes from oscillatory to exponential, and it is forced to be continuous in the patching region.

The three constants are found using the normalization condition and enforcing the requirement
that the wave function $\Psi(z)$ and its derivative $d\Psi/dz$ be continuous at the
turning point z = 0.

One way to enforce the continuity requirement is to linearize the potential V(z) around
this turning point at z = 0 using $V(z) \approx E + V_d z$, where $V_d$ is the value of dV/dz at z = 0.
If we introduce a new variable $x = \alpha z$ with $\alpha = (2mV_d/\hbar^2)^{1/3}$, we can write Eq. (6.16) in
the form

$$\frac{d^2\Psi}{dx^2} - x\Psi(x) = 0. \tag{6.26}$$

This is the Airy equation and its general solution can be written in terms of the Airy
functions Ai(x) and Bi(x) as

$$\Psi(x) = C_A\,\mathrm{Ai}(x) + C_B\,\mathrm{Bi}(x), \tag{6.27}$$


where the constants $C_A$ and $C_B$ must be determined from the boundary conditions. This
solution must match the WKB solution for values of z outside the patching region in
Figure 6.7. However, the Airy function Bi(x) grows exponentially for large values of x,
whereas the WKB solution requires exponential decay. Because Ai(x) decays exponentially
for large values of x, matching becomes possible if we set $C_B = 0$. Using the well-known
asymptotic forms of Ai(x) for positive and negative values of x, the wave function has the
following functional form:

$$\Psi_{\rm Airy}(z) = |\alpha z|^{-1/4} \times \begin{cases}
C_a\exp\left[-\tfrac{2}{3}(\alpha z)^{3/2}\right], & \text{if } z > 0 \\[1ex]
C_{a+}\exp\left(\tfrac{2i}{3}|\alpha z|^{3/2}\right) + C_{a-}\exp\left(-\tfrac{2i}{3}|\alpha z|^{3/2}\right), & \text{if } z < 0,
\end{cases} \tag{6.28}$$

where the constants are determined by matching this form to the asymptotic form of the
WKB solution:

$$\Psi_{\rm WKB}(z) = |\alpha^3 z|^{-1/4} \times \begin{cases}
C_w\exp\left[-\tfrac{2}{3}(\alpha z)^{3/2}\right], & \text{if } z > 0 \\[1ex]
C_{w+}\exp\left(\tfrac{2i}{3}|\alpha z|^{3/2}\right) + C_{w-}\exp\left(-\tfrac{2i}{3}|\alpha z|^{3/2}\right), & \text{if } z < 0.
\end{cases} \tag{6.29}$$
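
The oscillatory asymptotic form used for z < 0 can be checked numerically. The sketch below is only an illustration and is not part of the derivation: it integrates the Airy equation $\Psi'' = x\Psi$ from x = 0 toward negative x with a fourth-order Runge-Kutta scheme, starting from the standard values of Ai(0) and Ai'(0), and compares the result with the leading real form $\sin(\tfrac{2}{3}|x|^{3/2} + \pi/4)/(\sqrt{\pi}\,|x|^{1/4})$, which is one particular combination of the two complex exponentials. The function names and the step count are our own choices.

```python
import math

# Standard values of the Airy function and its derivative at the origin
AI0 = 0.3550280538878172
AIP0 = -0.2588194037928068


def airy_ai_negative(x_target, steps=20000):
    """Integrate Psi'' = x * Psi from x = 0 to x_target < 0 using RK4."""
    h = x_target / steps  # negative step size
    x, y, dy = 0.0, AI0, AIP0
    for _ in range(steps):
        k1y, k1d = dy, x * y
        k2y, k2d = dy + 0.5 * h * k1d, (x + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3d = dy + 0.5 * h * k2d, (x + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4d = dy + h * k3d, (x + h) * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        dy += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6
        x += h
    return y


def airy_ai_asymptotic(x):
    """Leading oscillatory form of Ai(x) for large negative x."""
    ax = abs(x)
    phase = (2.0 / 3.0) * ax ** 1.5 + math.pi / 4
    return math.sin(phase) / (math.sqrt(math.pi) * ax ** 0.25)


if __name__ == "__main__":
    for x in (-2.0, -4.0, -8.0):
        print(x, airy_ai_negative(x), airy_ai_asymptotic(x))
```

The agreement improves as |x| grows, which is why the matching is carried out outside the patching region rather than at the turning point itself.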


6.2.2 Tunneling in the WKB Approximation 


We now apply the WKB approximation to the tunneling problem. We assume that the 
potential V(z) exceeds the particle’s energy E in the entire region $z_1 < z < z_2$. This is
not a severe restriction because the barrier can always be partitioned into several segments, 
each satisfying the preceding condition. Even though classically forbidden, there is a finite 
probability of the particle’s tunneling through the barrier that we want to calculate. 

As discussed in the preceding section, the Schrödinger equation can be solved analyti- 
cally in the WKB approximation. The resulting wave function has different forms in three 
regions of Figure 6.6: 


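$$\Psi(z) = \begin{cases}
A_+\exp\left(\dfrac{i}{\hbar}\sqrt{2mE}\,z\right) + A_-\exp\left(-\dfrac{i}{\hbar}\sqrt{2mE}\,z\right), & \text{if } z < z_1 \\[2ex]
\dfrac{B_+}{\sqrt{|p(z)|}}\exp\left(-\dfrac{1}{\hbar}\displaystyle\int_{z_1}^{z}|p(z)|\,dz\right), & \text{if } z_1 < z < z_2 \\[2ex]
C_+\exp\left(\dfrac{i}{\hbar}\sqrt{2mE}\,z\right), & \text{if } z > z_2,
\end{cases} \tag{6.30}$$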


where p(z) in the barrier region is given in Eq. (6.17). On physical grounds, a backward
propagating wave must be included in the region $z < z_1$ but not in the region $z > z_2$.
The four constants, $A_+$, $A_-$, $B_+$, and $C_+$, are determined using the continuity of the wave
function and its derivative at the two boundaries located at $z = z_1$ and $z = z_2$. It is important
to note that we only kept the exponentially decaying solution in the barrier region because
the exponentially growing solution cannot be sustained for wide barriers without violating
the conservation of energy and momentum principles.

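Invoking the continuity of $\Psi(z)$ and its derivative $d\Psi/dz$ at $z = z_1$, we obtain the
relations

$$A_+\exp\left(\frac{i}{\hbar}\sqrt{2mE}\,z_1\right) + A_-\exp\left(-\frac{i}{\hbar}\sqrt{2mE}\,z_1\right) = \frac{B_+}{\sqrt{|p(z_1)|}}, \tag{6.31}$$

$$A_+\exp\left(\frac{i}{\hbar}\sqrt{2mE}\,z_1\right) - A_-\exp\left(-\frac{i}{\hbar}\sqrt{2mE}\,z_1\right) = i\,\frac{\sqrt{|p(z_1)|}}{\sqrt{2mE}}\,B_+. \tag{6.32}$$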


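Similarly, the continuity of $\Psi(z)$ and $d\Psi/dz$ at the boundary $z = z_2$ provides the relations

$$\frac{B_+}{\sqrt{|p(z_2)|}}\exp\left(-\frac{1}{\hbar}\int_{z_1}^{z_2}|p(z)|\,dz\right) = C_+\exp\left(\frac{i}{\hbar}\sqrt{2mE}\,z_2\right), \tag{6.33}$$

$$B_+\exp\left(-\frac{1}{\hbar}\int_{z_1}^{z_2}|p(z)|\,dz\right) = -i\,\frac{\sqrt{2mE}}{\sqrt{|p(z_2)|}}\,C_+\exp\left(\frac{i}{\hbar}\sqrt{2mE}\,z_2\right). \tag{6.34}$$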


We can solve these four equations to find the four coefficients and obtain the wave function
$\Psi(z)$.

The quantity of primary interest is the tunneling probability given by the transmissivity
$T_t = |C_+|^2/|A_+|^2$. This ratio is easy to calculate and is given by

$$T_t = \frac{4|p(z_1)|}{|p(z_2)|}\left[1 + \frac{|p(z_1)|^2}{2mE}\right]^{-1}\exp\left(-\frac{2}{\hbar}\int_{z_1}^{z_2}|p(z)|\,dz\right). \tag{6.35}$$


We can write the transmission coefficient in the form

$$T_t = C_t\exp(-2\gamma), \qquad \gamma = \frac{1}{\hbar}\int_{z_1}^{z_2}\sqrt{2m[V(z) - E]}\,dz, \tag{6.36}$$

where the prefactor $C_t$ is of the order of 1 and its value depends on the particle’s energy E and
the potentials at the end points $z_1$ and $z_2$ of the barrier. In the case of a constant potential V
over the entire barrier region, the transmission coefficient becomes

$$T_t = \frac{4E}{V}\exp\left[-\frac{2}{\hbar}(z_2 - z_1)\sqrt{2m(V - E)}\right]. \tag{6.37}$$


This is the main result of the theory of quantum tunneling in the WKB approximation. It 
predicts an exponential reduction in the tunneling probability of a quantum particle with 
increasing width or height of the potential barrier. 
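
As a rough numerical illustration of Eqs. (6.36) and (6.37), the following sketch evaluates the WKB exponent $\gamma$ by a midpoint-rule integration. It assumes a free-electron mass, so that $\hbar^2/2m \approx 0.0381$ eV nm², and uses hypothetical barrier parameters; the function names are ours. For a rectangular barrier the numerical result should reproduce the closed form of Eq. (6.37), while for a parabolic barrier of the same height and base width the exponent is smaller by a factor of $\pi/4$.

```python
import math

HBAR2_OVER_2M = 0.0380998  # hbar^2 / (2 m_e) in eV nm^2 (free-electron mass assumed)


def wkb_gamma(V, E, z1, z2, n=2000):
    """gamma = (1/hbar) * integral of sqrt(2m[V(z) - E]) from z1 to z2 (midpoint rule)."""
    dz = (z2 - z1) / n
    total = 0.0
    for k in range(n):
        z = z1 + (k + 0.5) * dz
        total += math.sqrt(max(V(z) - E, 0.0) / HBAR2_OVER_2M)
    return total * dz


def wkb_transmission(V, E, z1, z2, prefactor=1.0):
    """T_t = C_t * exp(-2 gamma), with the prefactor C_t supplied by the caller."""
    return prefactor * math.exp(-2.0 * wkb_gamma(V, E, z1, z2))


# Hypothetical parameters: 2 eV barrier, 1 eV electron, 1 nm width
V0, E, w = 2.0, 1.0, 1.0

# Rectangular barrier: compare with the closed form of Eq. (6.37)
gamma_rect = wkb_gamma(lambda z: V0, E, 0.0, w)
T_rect = wkb_transmission(lambda z: V0, E, 0.0, w, prefactor=4 * E / V0)
T_closed = (4 * E / V0) * math.exp(-2 * w * math.sqrt((V0 - E) / HBAR2_OVER_2M))

# Parabolic barrier with the same peak height and turning points at z = 0 and z = w
gamma_par = wkb_gamma(lambda z: E + (V0 - E) * (1 - (2 * z / w - 1) ** 2), E, 0.0, w)

if __name__ == "__main__":
    print("rectangular:", T_rect, "closed form:", T_closed)
    print("gamma ratio parabolic/rectangular:", gamma_par / gamma_rect)
```

Because the transmissivity depends exponentially on $\gamma$, even the modest reduction of the exponent for the parabolic shape changes the tunneling probability by a large factor.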


6.3 Transfer Hamiltonian Method 


An alternative approach is based on a tunneling Hamiltonian and is called the transfer 
Hamiltonian method. It is used to describe the tunneling phenomenon without making the 
WKB approximation. Historically, the tunneling Hamiltonian was introduced in 1961 by 
Bardeen to explain Giaever’s observation of tunneling in a superconducting tunnel junction [355]. It
was further developed by Harrison [356], and formulated in its second-quantized form by 
Cohen et al. [357]. This approach has been used extensively for describing tunnel junctions 
(including Josephson junctions) and the Coulomb blockade phenomenon. The tunneling 
Hamiltonian of the whole system is divided into three parts as 


$$H_S = H_L + H_R + H_T, \tag{6.38}$$


where $H_L$ is the Hamiltonian of the left electrode, $H_R$ is the Hamiltonian of the right
electrode, and $H_T$ is the Hamiltonian of the tunneling barrier. The last part, $H_T$, is treated as a
perturbation to the system Hamiltonian [358, 355, 359].

The rationale behind this method is to exploit the well-understood features of the system
in the absence of the tunneling barrier, and use time-dependent perturbation theory to study
the system’s response to the tunneling barrier. This method also allows us to calculate the
current flowing through the tunneling barrier by calculating the transfer rate of charged
particles across the potential barrier in the forward and backward directions. Owing to the
use of perturbation theory, the validity of this approach is limited to the cases where $H_T$ is
relatively small compared to other parts of the system Hamiltonian.

In practice, the analytical expressions resulting from the application of perturbation 
theory are relatively cumbersome. One way out of this complexity is to exploit the well- 
established diagrammatic techniques used in quantum electrodynamics and many-body 
theory [131]. Besides the appealing aspect of representing perturbative expressions with 
drawings, the diagrammatic method can also be used for reasoning and problem solving. 
The easily recognizable topology of diagrams makes the diagrammatic method a powerful 
tool for constructing equations that may hold even beyond perturbation theory. With the use 
of diagrammatic techniques, the transfer Hamiltonian method has been used to include the 
many-body effects such as quasi-particle tunneling or phonon-assisted tunneling. In this 
section we only present the key features of the transfer Hamiltonian method by applying 
time-dependent perturbation theory [305]. 


6.3.1 Time-Dependent Tunneling Theory 


To understand the basics of the transfer Hamiltonian method, we consider a simple setup 
shown in Figure 6.8, where a thin rectangular barrier is placed between two metallic
electrodes with the Hamiltonians $H_L$ and $H_R$. It is assumed that the left and right electrodes
interact only through this tunneling barrier of width w occupying the region −w/2 < z <
w/2. We also neglect Coulomb interactions among the electrons inside each electrode. The
wave function $\Psi_S(t)$ of the entire system under these conditions can be found by solving
the single-electron Schrödinger equation:

$$i\hbar\frac{\partial\Psi_S}{\partial t} = H_S\Psi_S(t), \tag{6.39}$$

where the system Hamiltonian $H_S$ is given by

$$H_S = \begin{cases}
-\dfrac{\hbar^2}{2m}\dfrac{d^2}{dz^2} + U_L, & \text{if } z < -\dfrac{w}{2} \\[2ex]
-\dfrac{\hbar^2}{2m}\dfrac{d^2}{dz^2} + U_0, & \text{if } -\dfrac{w}{2} < z < \dfrac{w}{2} \\[2ex]
-\dfrac{\hbar^2}{2m}\dfrac{d^2}{dz^2} + U_R, & \text{if } z > \dfrac{w}{2}.
\end{cases} \tag{6.40}$$

Figure 6.8: Schematic illustration of the tunneling-Hamiltonian approach. The tunneling barrier occupies the middle
region. The wave functions for the left and right regions are governed by the Hamiltonians $H_L$ and $H_R$, respectively;
double arrows mark their domain size.
Here $U_0$ is the potential of the barrier and $U_L$ and $U_R$ are the potentials on the left and
right sides of the barrier region, respectively. It is difficult to solve this time-dependent
problem exactly. We use the following strategy to construct an approximate solution by
using time-dependent perturbation theory.

We partition the system into two tractable subsystems by considering the left and 
right electrodes separately. We modify the potentials on each side to include the barrier’s 
potential. With this change, the potential for the left electrode takes the form 


$$U_L(z) = \begin{cases} U_L, & \text{if } z < -w/2 \\ U_0, & \text{if } -w/2 < z < w/2 \\ 0, & \text{if } z > w/2. \end{cases} \tag{6.41}$$


The potential for the right electrode has the same form except $U_R(z) = 0$ for z < −w/2 and
takes the value $U_R$ for z > w/2. Note that the potentials coincide with the actual potential
of each electrode but vanish at the other electrode. This simplification allows one to find the
stationary solutions of the left and right electrodes and use them to approximately construct
the time evolution of the entire system. The stationary states of $H_L$ and $H_R$ are found by
solving the following time-independent Schrödinger equations:

$$-\frac{\hbar^2}{2m}\frac{d^2\Psi_{L\eta}}{dz^2} + U_L(z)\Psi_{L\eta}(z) = E_{L\eta}\Psi_{L\eta}(z), \tag{6.42}$$

$$-\frac{\hbar^2}{2m}\frac{d^2\Psi_{R\zeta}}{dz^2} + U_R(z)\Psi_{R\zeta}(z) = E_{R\zeta}\Psi_{R\zeta}(z), \tag{6.43}$$

where we used the index $\eta$ for the left electrode and $\zeta$ for the right electrode. With this
notation, the eigen-energies of the two electrodes are $E_{L\eta}$ and $E_{R\zeta}$.

Consider an electron in the left electrode with energy $E_{L\eta}$ and assume that its interaction
via the barrier is switched on at time t = 0. If there is no coupling, its wave function
evolves as $\Psi_{L\eta}\exp(-iE_{L\eta}t/\hbar)$. In the presence of the coupling, we can express the
time-dependent wave function $\Psi_S(t)$ of the total Hamiltonian $H_S$ using the eigenfunctions
$\Psi_{L\eta}(z)$ and $\Psi_{R\zeta}(z)$. Noting that $H_S$ and the two partial Hamiltonians ($H_L$ and $H_R$) are
identical in the barrier region, the solutions of $H_L$ and $H_R$ in the overlapping regions must
satisfy the relations

$$H_S\Psi_{L\eta}(z) = E_{L\eta}\Psi_{L\eta}(z) \quad \text{if } z < w/2, \tag{6.44}$$

$$H_S\Psi_{R\zeta}(z) = E_{R\zeta}\Psi_{R\zeta}(z) \quad \text{if } z > -w/2. \tag{6.45}$$

We express the time-dependent wave function $\Psi_S(t)$ using these eigenfunctions and write
it in the form

$$\Psi_S(t) = \Psi_{L\eta}\exp\left(-\frac{i}{\hbar}E_{L\eta}t\right) + \sum_\zeta \kappa_{R\zeta}(t)\,\Psi_{R\zeta}, \tag{6.46}$$

where the sum is over all bound states of the right electrode. For each bound state, $\kappa_{R\zeta}(t)$
represents the time-dependent coupling owing to leaking of the wave function into the
potential barrier. To simplify the notation, we have absorbed the time evolution of the
states $\Psi_{R\zeta}$ into the coupling coefficient. As the coupling is switched on at t = 0, the initial
condition is $\kappa_{R\zeta}(0) = 0$ for all the bound states. Moreover, the condition $|\kappa_{R\zeta}(t)| \ll 1$ must
hold for all t > 0 in view of our assumption of weak coupling.


6.3.2 Calculation of the Coupling Coefficient 
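We need an expression for the coupling coefficient $\kappa_{R\zeta}(t)$. For this purpose, we substitute
Eq. (6.46) on the right side of Eq. (6.39) to obtain

$$i\hbar\frac{\partial\Psi_S}{\partial t} = H_S\Psi_{L\eta}\exp\left(-\frac{i}{\hbar}E_{L\eta}t\right) + \sum_\zeta \kappa_{R\zeta}(t)\,H_S\Psi_{R\zeta}. \tag{6.47}$$

Replacing $H_S$ with $H_L + (H_S - H_L)$ in the first term and with $H_R + (H_S - H_R)$ in the second
term, we obtain

$$i\hbar\frac{\partial\Psi_S}{\partial t} = [E_{L\eta} + (H_S - H_L)]\Psi_{L\eta}\exp\left(-\frac{i}{\hbar}E_{L\eta}t\right) + \sum_\zeta \kappa_{R\zeta}(t)\,[E_{R\zeta} + (H_S - H_R)]\Psi_{R\zeta}. \tag{6.48}$$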


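But we can also directly differentiate $\Psi_S(t)$ with respect to t to get

$$i\hbar\frac{\partial\Psi_S}{\partial t} = E_{L\eta}\Psi_{L\eta}\exp\left(-\frac{i}{\hbar}E_{L\eta}t\right) + i\hbar\sum_\zeta \frac{d\kappa_{R\zeta}(t)}{dt}\Psi_{R\zeta}. \tag{6.49}$$

Equating the preceding two equations, we obtain the relation

$$i\hbar\sum_\zeta \frac{d\kappa_{R\zeta}}{dt}\Psi_{R\zeta} = (H_S - H_L)\Psi_{L\eta}\exp\left(-\frac{i}{\hbar}E_{L\eta}t\right) + \sum_\zeta \kappa_{R\zeta}(t)\,[E_{R\zeta} + (H_S - H_R)]\Psi_{R\zeta}. \tag{6.50}$$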


We discard the last sum in this equation owing to our assumption $|\kappa_{R\zeta}(t)| \ll 1$.
The next step is to multiply the preceding equation with $\Psi^*_{R\zeta}$ and integrate over z.
Using the orthogonal property of the bound states $\Psi_{R\zeta}$, the result can be written as

$$i\hbar\frac{d\kappa_{R\zeta}}{dt} = V_{\eta\zeta}\exp\left[-\frac{i}{\hbar}(E_{L\eta} - E_{R\zeta})t\right], \tag{6.51}$$

where the matrix element representing coupling between the eigenstates of the left and
right electrodes (responsible for tunneling) is defined as

$$V_{\eta\zeta} = \int_{-\infty}^{\infty}\Psi^*_{R\zeta}(z)\,(H_S - E_{L\eta})\,\Psi_{L\eta}(z)\,dz. \tag{6.52}$$

The preceding first-order differential equation can be easily integrated to obtain the
following solution for t > 0:

$$\kappa_{R\zeta}(t) = \frac{V_{\eta\zeta}}{E_{L\eta} - E_{R\zeta}}\left\{\exp\left[-\frac{i}{\hbar}(E_{L\eta} - E_{R\zeta})t\right] - 1\right\}. \tag{6.53}$$


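Finally, we need to calculate the matrix element $V_{\eta\zeta}$. As $H_S\Psi_{L\eta}(z) = E_{L\eta}\Psi_{L\eta}(z)$ for any
z < w/2, the lower limit of the integral can be set to $z_0$, where $z_0$ is any value in the range
[−w/2, w/2]. Thus,

$$V_{\eta\zeta} = \int_{z_0}^{\infty}\Psi^*_{R\zeta}(z)\,(H_S - E_{L\eta})\,\Psi_{L\eta}(z)\,dz. \tag{6.54}$$

Noting that $H_S\Psi_{R\zeta}(z) = E_{R\zeta}\Psi_{R\zeta}(z)$ for any z > −w/2, we have the relation

$$\int_{z_0}^{\infty}\Psi_{L\eta}(z)\,(H_S - E_{R\zeta})\,\Psi^*_{R\zeta}(z)\,dz = 0. \tag{6.55}$$

Subtracting this integral from $V_{\eta\zeta}$, we obtain

$$V_{\eta\zeta} = \int_{z_0}^{\infty}\left[\Psi^*_{R\zeta}(z)(H_S - E_{L\eta})\Psi_{L\eta}(z) - \Psi_{L\eta}(z)(H_S - E_{R\zeta})\Psi^*_{R\zeta}(z)\right]dz. \tag{6.56}$$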


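Using the form of $H_S$ given in Eq. (6.40), we can write $V_{\eta\zeta}$ in the form

$$V_{\eta\zeta} = \frac{\hbar^2}{2m}\int_{z_0}^{\infty}\left[\Psi^*_{R\zeta}(z)\left(-\frac{d^2}{dz^2}\right)\Psi_{L\eta}(z) - \Psi_{L\eta}(z)\left(-\frac{d^2}{dz^2}\right)\Psi^*_{R\zeta}(z)\right]dz. \tag{6.57}$$

Consider the first integral. Integrating by parts, we obtain

$$\int_{z_0}^{\infty}\Psi^*_{R\zeta}\frac{d^2\Psi_{L\eta}}{dz^2}\,dz = \left[\Psi^*_{R\zeta}\frac{d\Psi_{L\eta}}{dz}\right]_{z_0}^{\infty} - \int_{z_0}^{\infty}\frac{d\Psi^*_{R\zeta}}{dz}\frac{d\Psi_{L\eta}}{dz}\,dz. \tag{6.58}$$

A similar expression is obtained for the second integral. Subtracting the two, the integrals
cancel out and we obtain only the two boundary terms. The resulting expression is

$$V_{\eta\zeta} = \frac{\hbar^2}{2m}\left[\Psi^*_{R\zeta}(z_0)\frac{d\Psi_{L\eta}(z_0)}{dz} - \Psi_{L\eta}(z_0)\frac{d\Psi^*_{R\zeta}(z_0)}{dz}\right]. \tag{6.59}$$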


As our lower integration limit was set within the barrier region ($-w/2 < z_0 < w/2$),
this expression is valid for any value of $z_0$ falling within the barrier. Clearly, the magnitude
of $V_{\eta\zeta}$ depends on the overlap of the left and right wave functions within the barrier
region. Even though we have done our calculation for a single variable, the procedure
can be readily extended to two or three dimensions but with considerably more algebra
(see Ref. [355]).
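
The independence of the matrix element from the choice of $z_0$ can be illustrated with a minimal sketch. It assumes, purely for illustration, that both states have the same energy and therefore the same decay constant inside the barrier, so that the left state decays as $\exp(-\kappa z)$ and the right state grows as $\exp(\kappa z)$ there. The boundary-term expression with prefactor $\hbar^2/2m$ is evaluated at several points inside the barrier; all names, amplitudes, and parameter values below are hypothetical.

```python
import math

HBAR2_OVER_2M = 0.0380998  # hbar^2 / (2 m_e) in eV nm^2 (free-electron mass assumed)


def make_tails(kappa, w, a_L=1.0, a_R=1.0):
    """Real evanescent tails inside the barrier -w/2 < z < w/2 and their derivatives."""
    psi_L = lambda z: a_L * math.exp(-kappa * (z + w / 2))   # leaks in from the left
    dpsi_L = lambda z: -kappa * psi_L(z)
    psi_R = lambda z: a_R * math.exp(kappa * (z - w / 2))    # leaks in from the right
    dpsi_R = lambda z: kappa * psi_R(z)
    return psi_L, dpsi_L, psi_R, dpsi_R


def bardeen_element(psi_L, dpsi_L, psi_R, dpsi_R, z0):
    """Boundary-term form of the matrix element for real wave functions."""
    return HBAR2_OVER_2M * (psi_R(z0) * dpsi_L(z0) - psi_L(z0) * dpsi_R(z0))


kappa, w = 5.0, 1.0  # decay constant in 1/nm and barrier width in nm (hypothetical)
tails = make_tails(kappa, w)
values = [bardeen_element(*tails, z0) for z0 in (-0.4, -0.2, 0.0, 0.2, 0.4)]
closed_form = -2.0 * kappa * HBAR2_OVER_2M * math.exp(-kappa * w)

if __name__ == "__main__":
    print(values)
    print(closed_form)
```

In this simple case the two exponentials multiply to a constant, so the coupling scales as $\exp(-\kappa w)$ and carries the same exponential dependence on the barrier width as the WKB transmission amplitude.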
6.3.3 Tunneling Current


To calculate the current resulting from tunneling of electrons across the potential barrier,
we need to first calculate the probability of an electron tunneling through this barrier. The
probability amplitude for an electron in the left electrode to tunnel to the bound state $\Psi_{R\zeta}$
in the right electrode can be calculated using

$$\int \Psi^*_{R\zeta}(z)\,\Psi_S(t)\,dz = \kappa_{R\zeta}(t) + \exp\left(-\frac{i}{\hbar}E_{L\eta}t\right)\int \Psi^*_{R\zeta}(z)\,\Psi_{L\eta}(z)\,dz, \tag{6.60}$$

where we used $\Psi_S(t)$ given in Eq. (6.46). Making use of Oppenheimer perturbation theory,
we can discard the second term. This amounts to assuming that the eigenstates of the left
and right electrodes are approximately orthogonal.

The rate $R_{L\to R}$ at which an electron initially in the left electrode ends up on the right
electrode can be calculated by assuming that both electrodes are in nearly thermal equilibrium.
In this situation, we can use their Fermi–Dirac distribution for the occupancy probability
of a state. Summing over all the final states $\zeta$ of the right electrode, we obtain

$$R_{L\to R} = \frac{d}{dt}\sum_\zeta F_{L\eta}(E_{L\eta})\,[1 - F_{R\zeta}(E_{R\zeta})]\left|\int_{-\infty}^{\infty}\Psi^*_{R\zeta}(z)\,\Psi_S(t)\,dz\right|^2
= \frac{d}{dt}\sum_\zeta F_{L\eta}(E_{L\eta})\,[1 - F_{R\zeta}(E_{R\zeta})]\,|\kappa_{R\zeta}(t)|^2, \tag{6.61}$$

where $F_{L\eta}(E_{L\eta})$ is the occupancy probability of an electron in the state $\eta$ of the left
electrode and the term $1 - F_{R\zeta}(E_{R\zeta})$ represents the vacancy probability of the state $\zeta$ at the
right electrode to receive this electron. Both of these probabilities are needed to satisfy the
Pauli exclusion principle.

We can now calculate the total current, $I_{L\to R}$, from the left electrode to the right
electrode. For this, we need to consider the tunneling rate through the barrier in both directions
(i.e., $I_{L\to R} = -q_e(R_{L\to R} - R_{R\to L})$). The calculation of $R_{R\to L}$ mirrors the preceding
calculation of $R_{L\to R}$. The resulting expression is

$$I_{L\to R} = -q_e\frac{d}{dt}\sum_\eta\sum_\zeta\Big\{F_{L\eta}(E_{L\eta})\,[1 - F_{R\zeta}(E_{R\zeta})]\,|\kappa_{R\zeta}(t)|^2
- F_{R\zeta}(E_{R\zeta})\,[1 - F_{L\eta}(E_{L\eta})]\,|\kappa_{L\eta}(t)|^2\Big\}. \tag{6.62}$$

We calculate $|\kappa_{R\zeta}(t)|^2$ using Eq. (6.53) and obtain

$$|\kappa_{R\zeta}(t)|^2 = 4\,|V_{\eta\zeta}|^2\,\frac{\sin^2[(E_{L\eta} - E_{R\zeta})t/2\hbar]}{(E_{L\eta} - E_{R\zeta})^2}. \tag{6.63}$$

The quantity $|\kappa_{L\eta}(t)|^2$ is given by a similar expression.

The final step is to replace the double sum in the preceding expression for $I_{L\to R}$ with
two energy integrals using the density of states of the left and right electrodes. In the
long-time limit, we can also make use of the known result
$\delta(x) = \lim_{t\to\infty}[\sin^2(xt)/(\pi x^2 t)]$. The time t must be long enough for this limit to
apply, yet short enough that $|\kappa_{R\zeta}(t)| \ll 1$ is maintained. Using it, we can write

$$\frac{d}{dt}\sum_\zeta |\kappa_{R\zeta}(t)|^2 \approx \frac{2\pi}{\hbar}\,|V_{\eta\zeta}|^2\,D_{osR}(E_{R\zeta})\,\delta(E_{L\eta} - E_{R\zeta}), \tag{6.64}$$

which can be seen as an application of Fermi’s golden rule. Here, $D_{osR}(E_{R\zeta})$ represents
the density of states at the right electrode. It enters in Eq. (6.64) when we replace the
sum over the quantum states with an energy integral.
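
The delta-function limit invoked above can be checked with a short numerical sketch. The function $\sin^2(xt)/(\pi x^2 t)$ integrates to unity over the whole real axis for any t > 0, and the fraction of that weight lying inside a fixed window around x = 0 approaches 1 as t grows, which is the sense in which it tends to $\delta(x)$. The window size, the sample count, and the function name below are our own choices for illustration.

```python
import math


def window_weight(t, half_width=1.0, n=40000):
    """Integral of sin^2(x t) / (pi x^2 t) over |x| < half_width, midpoint rule.

    With an even number of intervals the midpoints never coincide with x = 0,
    so the removable singularity at the origin causes no division by zero.
    """
    dx = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * dx
        total += math.sin(x * t) ** 2 / (math.pi * x * x * t)
    return total * dx


weights = {t: window_weight(t) for t in (1.0, 10.0, 100.0)}

if __name__ == "__main__":
    for t, wgt in weights.items():
        print(t, wgt)
```

The missing weight falls off roughly as $1/(\pi t)$ for a window of unit half-width, so the peak sharpens only algebraically in t; this is why the long-time limit has to be balanced against the weak-coupling condition.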

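Using Eq. (6.64), the total current can be written as

$$I_{L\to R} = \frac{2\pi q_e}{\hbar}\iint_{-\infty}^{\infty}|V_{\eta\zeta}|^2\,D_{osR}(E_{R\zeta})\,D_{osL}(E_{L\eta})
\,[F_{R\zeta}(E_{R\zeta}) - F_{L\eta}(E_{L\eta})]\,\delta(E_{L\eta} - E_{R\zeta})\,dE_{L\eta}\,dE_{R\zeta}. \tag{6.65}$$

One of the integrals can be readily done because of the presence of the delta function,
resulting in

$$I_{L\to R} = \frac{2\pi q_e}{\hbar}\int_{-\infty}^{\infty}|V_{\eta\zeta}|^2\,D_{osR}(x)\,D_{osL}(x)\,[F_{R\zeta}(x) - F_{L\eta}(x)]\,dx. \tag{6.66}$$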

In this expression, the applied voltage $V_{LR}$ appears through the Fermi–Dirac distributions
containing the chemical potentials of two electrodes that are related by $\mu_R = \mu_L - q_eV_{LR}$.
As we saw in Aside 6.1, it is possible to carry out the preceding integral through a
linearization process and show that the current satisfies Ohm’s law.
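
The Ohmic behavior can be made plausible with a small numerical sketch. It assumes, as a simplification that is not part of the text, that the product of the squared matrix element and the two densities of states is independent of energy and can be set to 1, and that the energy integrand reduces to the difference of the two Fermi–Dirac occupancies. Energies are in eV, so $q_eV_{LR}$ is numerically equal to the bias in volts; the function names and the thermal energy are illustrative choices.

```python
import math

K_B_T = 0.025  # thermal energy in eV (roughly room temperature, assumed)


def fermi(E, mu, kT=K_B_T):
    """Fermi-Dirac occupancy at energy E for chemical potential mu."""
    return 1.0 / (1.0 + math.exp((E - mu) / kT))


def current_integral(V_LR, mu_L=0.0, span=2.0, n=20000):
    """Integral of [F_R(x) - F_L(x)] dx with mu_R = mu_L - q_e V_LR (midpoint rule).

    The prefactor and the energy-independent factors are dropped, so the
    result is the current in arbitrary units.
    """
    mu_R = mu_L - V_LR
    lo = mu_L - span
    dx = 2.0 * span / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += fermi(x, mu_R) - fermi(x, mu_L)
    return total * dx


if __name__ == "__main__":
    for V in (0.001, 0.01, 0.05):
        print(V, current_integral(V))
```

Under these assumptions the integral equals $\mu_R - \mu_L = -q_eV_{LR}$, so the current is strictly proportional to the bias. The negative sign reflects that electrons flow from the electrode with the higher chemical potential, so the conventional current runs in the opposite direction.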


6.4 Sequential Tunneling 


So far we have considered a simple device in which two metallic electrodes are connected 
to a thin barrier and where charges transfer from one electrode to the other through tun- 
neling of electrons across this barrier. We call this type of charge transport fully coherent 
if the electron’s transfer can be described by a single process, whose probability can be 
calculated using the Schrödinger equation. This is a reasonably accurate description when 
the average time electrons spend in the resonant state is much less than the scattering time. 
In other words, the lifetime of each energy eigenstate is much smaller than the scattering 
time. 

In more complicated devices, electrons may undergo two sequential tunneling events 
such that an electron first tunnels into a central island, and then tunnels out of this island 
after losing memory of its phase. As a result of this memory loss, the two sequential tunnel- 
ing events can be considered uncorrelated. Moreover, these two processes are not identical, 
as they face different conditions. The first tunneling process requires the availability of an
empty state in the central island at the same energy level as in the left electrode
(conservation of energy) and with the same lateral momentum (conservation of momentum).
The second tunneling process is less constrained because of the abundance of energy levels
in the receiving electrode.

As we have already seen, each tunneling event can be described using first-order pertur- 
bation theory. However, at low temperatures, the sequential tunneling may be suppressed 
by a process known as the Coulomb blockade, and the conductance of the system may 
become exponentially small. In this scenario, where the first-order contribution vanishes 
or becomes insignificant, the higher-order contributions may need to be considered. An 
example of the second-order process is provided by an event where two electrons with 
different energies tunnel simultaneously (called inelastic co-tunneling), or one electron 
tunnels coherently twice (called elastic co-tunneling); the latter process is equivalent to 
simultaneous tunneling of two electrons of the same energy [8, 360]. Note that the terms 
elastic and inelastic are not related to the presence or absence of a specific scattering pro- 
cess. Rather, they refer to whether the energy is lost (inelastic) or conserved (elastic) during 
the tunneling process. 


6.4.1 A Quantum Device Separated by Two Barriers 


Figure 6.9 shows a schematic of a quantum device separated from the left and right elec- 
trodes through two potential barriers. The left electrode is at a higher potential than the right 
one. Any electron must first tunnel from the left electrode into the quantum device, spend
some time there, and then tunnel to the right electrode. Such sequential tunneling events
can take place through elastic or inelastic co-tunneling processes, depending on whether
the electron’s energy is conserved or not. In both cases, the change in free energy is $-q_eV$ if V
is the potential difference between the electrodes. As there is no vacant state for electrons
to be received by the quantum device, the first-order tunneling is forbidden. However,
electrons overcome this difficulty by tunneling via a virtual vacant state in the quantum device.
Two simultaneous tunneling events via this virtual vacant state can transfer electrons from
the left electrode to the right electrode, resulting in current flow.

Figure 6.9: Schematic of a quantum device, separated from electrodes through two barriers, showing differences
between the elastic and inelastic co-tunneling processes.

Consider an electron at the left electrode overcoming the energy mismatch in the quan- 
tum device by violating energy conservation for a short time allowed by Heisenberg’s 
uncertainty principle (see Aside 2.11). If a different electron from the quantum device tun- 
nels to the right electrode during this short duration, one can view the two tunneling events 
as charge transfer from the left electrode to the right electrode. This process is called inelas- 
tic co-tunneling, because it produces an electron-hole excitation in the quantum device, 
which is eventually dissipated through carrier-carrier interactions. Inelastic co-tunneling 
in normal-metal tunnel junctions (as opposed to superconducting tunnel junctions) was 
first observed in 1990 by Geerligs et al. [361]. Matsuoka and Kimura observed the same 
phenomenon in 1995 using a silicon quantum dot [362]. 

In the case of elastic co-tunneling, the same electron tunnels into and out of the virtual 
state inside the quantum device. In this process, the phase of the electron’s wave function 
is preserved, making elastic co-tunneling a coherent process. Elastic co-tunneling depends 
strongly on the internal structure of the quantum device. Usually, inelastic co-tunneling 
dominates in comparison to elastic co-tunneling, except at very small bias voltages and 
temperatures or when the density of energy states is very low in the quantum device [363]. 
Hanna et al. were the first to observe elastic co-tunneling in a silicon quantum dot [364]. 

The tunneling of each particle with charge q increases the energy of the quantum device by
$E_C = q^2/(2C)$, where C is the capacitance of this device. As the electron’s energy is
conserved during tunneling, there is no source available for this energy! The need for energy
conservation arises from the application of Noether’s theorem to the time-translation
invariance of the Schrödinger equation. However, we should recall Heisenberg’s uncertainty
principle requiring $\Delta t\,\Delta E \geq \hbar/2$. It implies that energy conservation can be violated
as long as this violation lasts for a short time $\Delta t$ set by the uncertainty principle (see
Aside 2.11). Thus, an electron may enter and stay in the quantum device for a duration
$\Delta t \approx \hbar/E_C$, where we use $\Delta E = E_C$. There exists a small but finite chance for
co-tunneling to occur within this time window (i.e., the same electron or another electron
should leave the quantum device and tunnel to the right electrode within a duration of $\Delta t$).
The overall effect is that the system complies with the energy conservation principle, even
though a co-tunneling event has taken place.
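
To get a feeling for the numbers, the short sketch below evaluates the charging energy $q^2/(2C)$ and the corresponding time window $\hbar/E_C$ for a single electron. The capacitance of 1 aF is a hypothetical value chosen only for illustration, and the function names are ours.

```python
Q_E = 1.602176634e-19   # elementary charge in C
HBAR = 1.054571817e-34  # reduced Planck constant in J s


def charging_energy(C):
    """Charging energy E_C = q_e^2 / (2C) in joules for a capacitance C in farads."""
    return Q_E ** 2 / (2.0 * C)


def cotunneling_window(C):
    """Time window hbar / E_C in seconds allowed by the uncertainty principle."""
    return HBAR / charging_energy(C)


C_DEVICE = 1e-18  # 1 aF, hypothetical device capacitance
E_C_eV = charging_energy(C_DEVICE) / Q_E
dt = cotunneling_window(C_DEVICE)

if __name__ == "__main__":
    print("E_C =", E_C_eV, "eV")
    print("dt  =", dt, "s")
```

For this capacitance the charging energy is about 80 meV, a few times the thermal energy at room temperature, and the time window is of the order of 10 fs.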

It is useful to estimate the tunneling rates for the two co-tunneling processes. Suppose
$\Gamma_L$ is the tunneling rate of an electron from the left electrode to the quantum device. If the
tunneling rate to the right electrode is $\Gamma_R$, we can write the elastic co-tunneling rate, using
$\Delta t \approx \hbar/E_C$, as

$$\Gamma_{CT\text{-}el} = \Gamma_L\Gamma_R(\Delta t) \approx \Gamma_L\Gamma_R(\hbar/E_C). \tag{6.67}$$

The tunneling rates $\Gamma_L$ and $\Gamma_R$ can be estimated using the theory in Ref. [365]. For an
electrostatic energy difference $\Delta E$, they are given by

$$\Gamma_L = \frac{G_L\,\Delta E}{q_e^2\left[1 - \exp(-\Delta E/k_BT)\right]}, \qquad
\Gamma_R = \frac{G_R\,\Delta E}{q_e^2\left[1 - \exp(-\Delta E/k_BT)\right]}, \tag{6.68}$$

where $G_L$ and $G_R$ are the conductances of the two tunneling junctions and T is the absolute
temperature. At a temperature near 0 K, only transitions that decrease the electrostatic energy
are possible. Using $\Delta E = E_C \gg k_BT$, we obtain the simple relations

$$\Gamma_L = \frac{G_L\,\Delta E}{q_e^2} = \frac{G_LE_C}{q_e^2} = \frac{G_L}{G_Q}\frac{2E_C}{h}, \tag{6.69}$$

$$\Gamma_R = \frac{G_R\,\Delta E}{q_e^2} = \frac{G_RE_C}{q_e^2} = \frac{G_R}{G_Q}\frac{2E_C}{h}, \tag{6.70}$$

where the conductance quantum is defined as $G_Q = 2q_e^2/h$.
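
The low-temperature limit can be illustrated numerically. The sketch below assumes that the single-junction rate has the form $\Gamma = G\,\Delta E/\{q_e^2[1 - \exp(-\Delta E/k_BT)]\}$, which is our reading of the rate expression and should be treated as an assumption, and compares it with the cold limit $(G/G_Q)(2E_C/h)$. The junction conductance and charging energy are hypothetical values.

```python
import math

Q_E = 1.602176634e-19       # elementary charge in C
H_PLANCK = 6.62607015e-34   # Planck constant in J s
K_B = 1.380649e-23          # Boltzmann constant in J/K
G_Q = 2.0 * Q_E ** 2 / H_PLANCK  # conductance quantum in S


def tunnel_rate(G, dE, T):
    """Assumed rate G*dE / (q_e^2 [1 - exp(-dE / k_B T)]) in 1/s, for dE > 0 and T > 0."""
    x = dE / (K_B * T)
    return G * dE / (Q_E ** 2 * (-math.expm1(-x)))


def tunnel_rate_cold(G, E_C):
    """Cold limit (G / G_Q) * (2 E_C / h) in 1/s."""
    return (G / G_Q) * 2.0 * E_C / H_PLANCK


G = 0.01 * G_Q       # hypothetical junction conductance
E_C = 0.08 * Q_E     # hypothetical charging energy of 80 meV, in joules
rate_1K = tunnel_rate(G, E_C, 1.0)
rate_300K = tunnel_rate(G, E_C, 300.0)
rate_cold = tunnel_rate_cold(G, E_C)

if __name__ == "__main__":
    print(rate_1K, rate_300K, rate_cold)
```

At 1 K the thermal factor is indistinguishable from unity for this charging energy, whereas at 300 K it raises the rate by a few percent.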

To estimate the inelastic co-tunneling rate $\Gamma_{CT\text{-}in}$, we consider two different electrons,
the first tunneling from the left electrode to the quantum device, and the second from the
quantum device to the right electrode. This rate can be approximately written with the help
of Eqs. (6.67), (6.69), and (6.70) as

$$\Gamma_{CT\text{-}in} \approx \Gamma_L\Gamma_R\,\Delta t \sim \frac{G_LG_R}{G_Q^2}\frac{E_C}{\hbar}. \tag{6.71}$$

This expression shows that the inelastic co-tunneling rate is low compared to the elastic
one if both $G_L$ and $G_R$ are much smaller than $G_Q$. The same condition holds even in the
presence of the Coulomb blockade. When both of them are comparable to $G_Q$, the inelastic
co-tunneling rate becomes comparable to the single-charge tunneling through either
junction. Under such conditions, the Coulomb blockade disappears.

The preceding estimate for the inelastic co-tunneling rate can be improved with the 
following argument [8]. Consider an event involving two different electrons to complete 
the transfer of a single charge from the left electrode to the right electrode. During such an 
event, an electron-hole pair is created in the quantum device. This means, as a result of each 
two-electron transfer, four excitations are created in the system: a hole in the left electrode, 
an electron and a hole in the quantum device, and an electron in the right electrode. We 
label these four events using the index i = {1, 2, 3, 4} and denote the corresponding excitation
energies by $\epsilon_i > 0$. If there are no energy restrictions imposed on these excitations, all
$\epsilon_i$ are of the order of the charging energy $E_C$ introduced earlier. Suppose the electrodes are
kept at a potential difference V that is much smaller than the Coulomb-blockade threshold
($q_eV \ll E_C$). As all excitations must receive their energy from this external biasing, we can
conclude that the sum $\sum_i \epsilon_i$ equals $q_eV$. Given that each $\epsilon_i > 0$, it follows that $\epsilon_i < q_eV$
for i = 1, 2, 3, 4.

Consider the number of available electronic states in a certain energy range. As this
number is proportional to the magnitude of the energy range, the number of states available
for the excitations is reduced by a factor of about $(q_eV/E_C)^3$ compared to those available in the
range 0 to $E_C$. It is important to note that this factor is applicable only if the temperature is
low enough that the energy provided by the bias, $q_eV$, is much higher than the thermal energy
$k_BT$. Thus, when $k_BT \ll q_eV$, the rate of inelastic co-tunneling can be written as

$$\Gamma_{CT\text{-}in,V} \approx \frac{G_LG_R}{G_Q^2\,\hbar}\left(\frac{q_eV}{E_C}\right)^3 E_C. \tag{6.72}$$

When $k_BT \gg q_eV$, the excitations can appear only in the energy range of 0 to $k_BT$, and
this rate is replaced with

$$\Gamma_{CT\text{-}in,T} \approx \frac{G_LG_R}{G_Q^2\,\hbar}\left(\frac{k_BT}{E_C}\right)^2 k_BT. \tag{6.73}$$


In the case of thermally activated co-tunneling, there is no preferred direction in which 
tunneling is dominant. Owing to the presence of electrical bias, the electron co-tunneling 
rates from left to right and from right to left can differ only by a small factor ($\sim q_eV/k_BT$)
when $k_BT \gg q_eV$. Therefore, thermally activated co-tunneling current is estimated to be

$$I_{CT} \sim V\,\frac{G_LG_R}{G_Q}\left(\frac{k_BT}{E_C}\right)^2. \tag{6.74}$$

It is clear from this discussion that thermally activated co-tunneling is an inelastic process, 
and the associated I-V curve not only is nonlinear but also depends on temperature. 

Apart from the inelastic co-tunneling process involving two different electrons, there is a 
finite probability that the same electron that enters the quantum device from the left tunnels 
out to the right. Co-tunneling in this case is referred to as elastic because this single electron 
keeps its energy constant and does not create any excitations in the quantum device during 
its transit via a virtual state. Apart from a smaller probability of the occurrence of this 
event, there is no other difference in the process, and we can adopt the same methodology 
to estimate its tunneling rate that we used earlier for the inelastic co-tunneling rate. 

So far, we have considered charging energy $E_C$ as an energy uncertainty and related it
to the tunneling rate $\Gamma_L$ given in Eq. (6.69). However, in the case of elastic co-tunneling,
we can improve our estimate of $\Gamma_L$ by noting that a quantum device has discrete energy
levels. If an electron transits through the device elastically, the energy uncertainty of such
a transfer is given by $\Delta E \approx \delta E_{QD}$, where $\delta E_{QD}$ is the average spacing of energy levels
in the quantum device. With this choice, the tunneling rate from the left electrode to the
quantum device can be written as

$$\Gamma_L \approx \frac{G_L\,\delta E_{QD}}{q_e^2} = \frac{G_L}{G_Q}\frac{2\,\delta E_{QD}}{h}. \tag{6.75}$$

The tunneling rate $\Gamma_R$ from the quantum device to the right electrode is then obtained by
just replacing $G_L$ with $G_R$ in this equation (see Eq. (6.70)). This reasoning shows that the
probability of the same electron participating in both tunneling events is approximately
given by $\delta E_{QD}/E_C$. As no excitations are left in the quantum device and the electron retains
its original energy at the conclusion of the event, we should remove the extra factor of
$(q_eV/E_C)^2$, which was introduced in the inelastic case. It follows from Eq. (6.72) that the
co-tunneling rate at a small bias voltage V is given by

$$\Gamma_{CT\text{-}el} \approx \frac{G_L}{G_Q}\frac{G_R}{G_Q}\frac{q_eV}{E_C}\frac{\delta E_{QD}}{E_C}\frac{E_C}{\hbar}. \tag{6.76}$$

Comparing it with Eq. (6.72), we conclude that the elastic co-tunneling dominates at low
energies such that $\Delta E < \sqrt{\delta E_{QD}E_C}$.
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6.4.2 Scanning Tunneling Microscopy 


As an important application of co-tunneling, we consider scanning tunneling spectroscopy. 
This technique provides a direct way of measuring spatial variations in the density of elec- 
tronic states at a metallic surface within the energy window set by the bias voltage between 
the surface and a sharp metal tip. The measurement is carried out by scanning the metal 
tip over the conducting surface, while maintaining a small gap between the two. This gap 
(typically <1 nm wide) is chosen such that electrons can tunnel from the tip to the surface, 
resulting in a current flow. Binnig and Rohrer invented the instrument known as the
scanning tunneling microscope (STM) in 1981 and used it in 1983 to image the atomic
structure of a silicon surface [366]. For this work, they were awarded the 1986 Nobel Prize
in Physics. A detailed review of STM-based techniques can be found in Ref. [367].

Figure 6.10(a) shows schematically the major parts of an STM. A scanning tip is 
mounted on a piezo tube whose mechanical deformation is controlled by an external volt- 
age. As a result, the tip can be precisely positioned on the sample in both the lateral and 
vertical directions. The current through the vacuum gap is set to a specific value at the start 
of the measurement process. As the tip is raster-scanned over the surface, the tunneling cur- 
rent is kept constant by varying the tiny gap between the tip and the sample. Changes in the 
vertical position of the tip are used for creating an image of the surface. These images con- 
tain information about the geometry and the electronic structure of the surface. The spatial 
resolution of an STM is remarkable owing to the extremely high sensitivity of the tunneling
current to the tip-sample distance. Recent advances in scanning tunneling microscopy
are such that spatial features smaller than 0.1 nm can be resolved using commercial STM
equipment [368].

Figure 6.10 (a) Schematic of an STM showing its major parts; (b) energy diagram of the tunneling gap between the sample and
the STM tip.

We can understand the operation of an STM by considering the tunneling current through
the narrow gap between the metal tip and the surface at a fixed voltage $V$. If this voltage
is small compared to the work function $\Phi_W$ of the tip surface or the sample surface, the
tunneling barrier has a shape similar to that shown in Figure 6.10(b). The work functions
of the two surfaces set the barrier height, as seen there. In the WKB approximation, the
wave function $\Psi_n(z)$ of an electron of energy $E_n$ inside the potential barrier $\Phi_B(z)$ can be
written as (see Section 6.2.1):

$$\Psi_n(z) = \Psi_n(0) \exp\left(-\frac{1}{\hbar}\int_0^z \sqrt{2m_e[\Phi_B(z') - E_n]}\,dz'\right), \tag{6.77}$$

where z is the distance along the potential barrier (z = 0 at the start of the barrier). The 
integral can be evaluated for an arbitrary potential barrier using the saddle-point method 
(see Ref. [369]). 

In scanning tunneling microscopy, a small bias voltage $V_b$ is applied to initiate a tunneling
current. Noting that the work function of a metal represents the additional energy
required for an electron to leave the metal relative to its Fermi energy, the height of the
potential barrier can be approximated by the average of the work functions of the sample
and the tip as


$$\Phi_B(z) \approx \frac{1}{2}\left(\Phi_W^{\mathrm{tip}} + \Phi_W^{\mathrm{sample}}\right) \equiv \Phi_W, \tag{6.78}$$


where $\Phi_W$ has a constant value. With this approximation, the potential barrier becomes
rectangular in shape, and the integration can be done without needing the saddle-point
method. If we also assume $E_n \ll \Phi_W$, the wave function takes the form


$$\Psi_n(z) \approx \Psi_n(0) \exp\left(-\frac{z}{\hbar}\sqrt{2m_e\Phi_W}\right). \tag{6.79}$$


The tunneling current $I_T(z)$ is proportional to the probability of electrons tunneling
through the barrier. Summing over all possible energies, it can be written as


$$I_T(z) \propto -q_e \sum_{E_n}^{E_F} |\Psi_n(0)|^2 \exp\left(-\frac{2z}{\hbar}\sqrt{2m_e\Phi_W}\right), \tag{6.80}$$


where $E_F$ is the Fermi level and the lower limit $E_n$ depends on the applied bias voltage as
$E_n = E_F - q_e V_b$. We can carry out the summation using the definition of the density of
states:


$$D_{OS}(E, 0) = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \sum_{E_n = E - \epsilon}^{E} |\Psi_n(0)|^2. \tag{6.81}$$


As $q_e V_b \ll E_F$, we approximate the sum using the Taylor expansion of the function
$D_{OS}(E_F - q_e V_b, 0)$ around the Fermi energy $E_F$ and retain only the first two terms. The
result is given by

$$\sum_{E_n}^{E_F} |\Psi_n(0)|^2 \approx -q_e V_b D_{OS}(E_F, 0). \tag{6.82}$$



Substituting this expression back in Eq. (6.80), we obtain


$$I_T(z) \propto q_e^2 V_b D_{OS}(E_F, 0) \exp\left(-\frac{2z}{\hbar}\sqrt{2m_e\Phi_W}\right). \tag{6.83}$$

This equation shows that the tunneling current is proportional to the product of the local
density of states of the sample at the Fermi energy and the applied voltage. Note also that 
the tunneling current obeys Ohm’s law (i.e., the current is proportional to the voltage). It 
is evident from this result that an STM image does not represent the position of surface 
atoms. Rather, it is a map of spatial variations in the density of electronic states at the 
surface, within the energy window set by the bias voltage. The exponential dependence 
of this current on the gap distance means that the tunneling current is dominated by the 
shortest path from a single atom at the end of the scanning probe. This is the reason why 
the spatial resolution of an STM can be better than 0.1 nm. 
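
The exponential sensitivity described above can be made concrete with a short numerical sketch. It evaluates the decay constant $\kappa = \sqrt{2m_e\Phi_W}/\hbar$ that appears in the exponent $\exp(-2\kappa z)$ of the tunneling current and reports how much the current drops when the gap widens by 0.1 nm. The barrier height of 4.5 eV is an assumed, illustrative value (typical of metal work functions) and is not taken from the text; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant (J s)
M_E = 9.1093837015e-31      # free-electron mass (kg)
EV = 1.602176634e-19        # one electron volt in joules


def decay_constant(phi_w_ev):
    """Return kappa = sqrt(2 m_e Phi_W) / hbar in 1/m for a barrier height in eV."""
    return math.sqrt(2.0 * M_E * phi_w_ev * EV) / HBAR


def current_ratio(phi_w_ev, delta_z_m):
    """Factor by which the tunneling current drops when the gap widens by delta_z_m."""
    return math.exp(2.0 * decay_constant(phi_w_ev) * delta_z_m)


phi_w = 4.5                               # assumed barrier height (eV), illustrative only
kappa = decay_constant(phi_w)
ratio = current_ratio(phi_w, 0.1e-9)      # gap increased by 0.1 nm
print(f"kappa = {kappa * 1e-9:.2f} per nm; current drops by a factor of {ratio:.1f}")
```

Under these assumptions the current changes by nearly an order of magnitude for a 0.1 nm change in the gap, which is why the atom closest to the surface carries most of the current.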

Depending on the polarity of the applied bias, electrons can flow from the metal tip to the
sample surface, or vice versa. It is possible to exploit this feature to gain more information
about the density of states in the sample. For example, when the tip is negatively biased
with respect to the conductive sample, electrons can only tunnel from occupied states of
the tip into empty states of the sample, within the energy window $q_e V_b$. On the other hand,
when the tip is positively biased, tunneling of electrons occurs from the sample surface.
The measured tunneling current in this case originates from the occupied valence-band
states in the sample. Therefore, depending on the bias direction, occupied or empty states
of the surface can be probed [370].


6.5 Resonant Tunneling 


Resonant tunneling differs from the generic tunneling process by the presence of quasi-bound
states within the potential barrier that are classically forbidden for the tunneling
particle. Tunneling is facilitated if the energy of the tunneling particle approximately matches
the energy associated with one of these quasi-bound states. As a result, the tunneling transmission
coefficient peaks sharply and becomes close to unity when the electron's energy
is close to the energy of a quasi-bound state. This resonance phenomenon is similar to the
resonances seen in a Fabry–Perot resonator.

An excellent example of resonant tunneling is provided by a device called the resonant
tunneling diode (RTD), shown schematically in Figure 6.11(a). The RTD
consists of a quantum-dot island (often made of GaAs) that is separated from the left
and right electrodes by two thin barriers (made of AlAs). Electrons inside the quantum dot
can only belong to discrete energy levels that contribute to the resonant tunneling effect. A
voltage difference between the two electrodes can be used to modulate the location of these
resonances, as shown in part (b) of Figure 6.11. Owing to the deformation of the potential
barrier, one of the discrete energy levels in the quantum dot may coincide with the energy
of the incident electron, facilitating resonant tunneling.

In an RTD, one of the electrodes (called the emitter) supplies electrons, and the quantum
dot acts as a bandpass filter that probes the emitter through its discrete energy levels. This
electrical engineering analogy explains why the tunneling current increases when a discrete
energy level gets close to the electron's energy in the emitter and drops when the
increase in applied voltage moves it beyond the electron's energy. This current drop acts
like a negative differential resistance, a property that is of immense value for high-speed
electronics applications. The speed of such a device is limited only by its RC time constant
and other parasitics. As a result, an RTD can operate at terahertz frequencies and act as a
high-speed switch in nanoscale circuits.

Figure 6.11 (a) Schematic of a resonant tunneling diode; (b) energy diagram showing distortion of the potential energy curve
when one of the states in the quantum island is resonant with the energy of the incoming electron; (c) the V-I curve
of the device with the alignment of energy levels at low/high current points.

The first experimental demonstration of resonant tunneling was carried out in 1974 using
a double-barrier semiconductor heterostructure [371]. Since then, several types of RTDs
have been fabricated using different material technologies, including III-V and II-VI
semiconductors. There are many variants of the original structure, such as the use of heavily
doped p-n junctions in Esaki diodes or the use of quantum wells or quantum wires. The
RTD structure based on the use of silicon for the quantum island and SiGe for the barriers
is suitable for integration with modern CMOS technology. The RTD is different from other
switching technologies because its operation cannot be explained using classical transport
models and requires the use of quantum mechanics. Because of this, RTDs are often
used as a conceptual playground for exploring the physics of quantum devices. Moreover,
an extension of the double-barrier resonant tunneling problem to the more general case
of a periodic rectangular potential offers a simple model, known as the Kronig–Penney
model, for the behavior of electrons in a crystal lattice [372]. As is well known, the allowed
energies of electrons experiencing a periodic potential form continuous bands separated by
forbidden energy gaps.

Tsu and Esaki developed in 1973 a model capable of explaining the operation of an RTD
[373]. Their formulation assumed equal carrier masses in both the emitter and quantum-well
regions and derived an expression for the tunneling current through the device. This
model was extended in 1998 by Schulman to include the effects of different in-plane
masses in the emitter and quantum-well regions [374]. This enhancement provided an
explanation of the intricate features observed in the V-I characteristic that were previously
thought to require a much higher level of theoretical sophistication. Further improvements
have been made to the original model so that it can account for features such as band-bending
effects, self-consistent charging effects, spatially varying effective masses, and
quantized-emitter states.

Consider an RTD transporting electrons via elastic tunneling events, as shown in
Figure 6.12(a). In this situation, both the energy $E$ and the transverse momentum $k_\parallel$ of each
electron are conserved because of a matching energy level in the quantum well. Part (b) of
Figure 6.12 shows the energy-dispersion curve of the emitter (dashed curve) together with
the available states in the quantum well (thick lines) at three voltages. In elastic tunneling,
the onset voltage $V_0$ is defined as the voltage at which the Fermi energy $E_F$ in the emitter
equals the energy of the available states in the quantum well (i.e., bottom of the resonance
subband). As the voltage is increased, more and more carriers in the emitter are able to
map onto the available states in the quantum well, until the peak voltage $V_P$ is reached.
The tunneling current decreases after that until the energy at the bottom of the resonance
subband equals the band-edge energy of the emitter. This voltage is referred to as the valley
voltage $V_V$.

Figure 6.12 (a) Band diagram of an RTD with the energy-dispersion curves in the emitter and quantum island showing $E$ versus $k_\parallel$
(labeled regions: left electrode, quantum island, right electrode).
(b) Relative position of the parabolic energy-dispersion curve (dashed) in the emitter region at various voltages: $V_0$
(onset), $V_P$ (peak), and $V_V$ (valley). Solid lines show the available states in the quantum well.

In the following discussion, $m_E$ and $m_{QW}$ are the effective masses of electrons in the emitter
and the quantum well, respectively. The effective masses determine the intersection points
of the two relevant energy-dispersion curves, and hence directly affect the tunneling current.
To simplify the analysis, we ignore many second-order effects such as band bending.
Assuming that exactly half of the voltage drop occurs at the center of the quantum well,
the three voltages ($V_0$, $V_P$, and $V_V$) are given by [373, 374, 375]

$$V_0 = \frac{2}{q_e}(E_R - E_F), \qquad V_P = \frac{2}{q_e}\left(E_R - \frac{E_F}{\alpha}\right), \qquad V_V = \frac{2}{q_e}E_R, \tag{6.84}$$

where $\alpha = m_{QW}/(m_{QW} - m_E)$ and $E_R$ is the energy of the resonance with respect to
the emitter's conduction-band edge at zero bias. If $T(E, V)$ is the tunneling transmission
coefficient at energy $E$ and voltage $V$, the current density $J(V)$ can be written as

$$J(V) = \frac{2q_e}{\hbar}\frac{1}{(2\pi)^3}\int \frac{\partial E}{\partial k_z}\,T(E, V)\,[f_F(E) - f_F(E - q_e V)]\,d^3k, \tag{6.85}$$

where $f_F(E)$ is the Fermi–Dirac distribution, $k_z$ is the wave-vector component along the
tunneling direction, and the integration is over the entire k-space in the emitter.

As $m_E < m_{QW}$ is generally true, we introduce a new energy variable $U$ as

$$U = E - \frac{\hbar^2 k_\parallel^2}{2m_{QW}}. \tag{6.86}$$

With this definition, we can replace the variable $E$ with $U + \hbar^2 k_\parallel^2/(2m_{QW})$ and assume that
$k_\parallel$ varies between 0 and $\sqrt{2\alpha U m_E}/\hbar$. The integration over $k_\parallel$ then leads to

$$J(V) = \frac{q_e m_{QW} k_B T}{2\pi^2\hbar^3}\left\{\int_0^\infty T(U,V)\ln\left[\frac{1+\exp[(E_F-U)/k_BT]}{1+\exp[(E_F-U+q_eV)/k_BT]}\right]dU + \int_0^\infty T(U,V)\ln\left[\frac{1+\exp[(E_F-\alpha U+q_eV)/k_BT]}{1+\exp[(E_F-\alpha U)/k_BT]}\right]dU\right\}, \tag{6.87}$$


where the integration variable $k_z$ has been replaced with $U$. As we have seen before, several
techniques can be used to calculate $T(U, V)$ and use it in Eq. (6.87) to obtain the resonant tunneling
current numerically. Useful analytical results can also be obtained with appropriate
approximations [374].
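
As a rough illustration of how the effective masses shift the characteristic voltages, the following sketch evaluates the onset, peak, and valley voltages under the assumption stated earlier that half of the applied voltage drops at the center of the quantum well. The expressions used are $V_0 = 2(E_R - E_F)/q_e$, $V_P = 2(E_R - E_F/\alpha)/q_e$, and $V_V = 2E_R/q_e$ with $\alpha = m_{QW}/(m_{QW} - m_E)$; the form of $V_P$ is an assumption here, obtained by requiring the quantum-well subband to pass through the emitter state at the Fermi energy. The numerical values (resonance energy, Fermi energy, mass ratios) and the function name are hypothetical and chosen only for illustration.

```python
def rtd_voltages(e_r_ev, e_f_ev, m_e_rel, m_qw_rel):
    """Return (V_0, V_P, V_V) in volts.

    Energies are given in eV, so dividing by the magnitude of the electron
    charge simply turns eV into volts. Masses are relative effective masses.
    Assumes m_E < m_QW and that half of the bias drops at the well center.
    """
    if not m_qw_rel > m_e_rel:
        raise ValueError("this sketch assumes m_E < m_QW")
    alpha = m_qw_rel / (m_qw_rel - m_e_rel)
    v_onset = 2.0 * (e_r_ev - e_f_ev)            # subband bottom meets E_F
    v_peak = 2.0 * (e_r_ev - e_f_ev / alpha)     # assumed peak condition
    v_valley = 2.0 * e_r_ev                      # subband bottom meets band edge
    return v_onset, v_peak, v_valley


# Hypothetical parameters: E_R = 0.10 eV, E_F = 0.03 eV, m_E = 0.067, m_QW = 0.10
v0, vp, vv = rtd_voltages(0.10, 0.03, 0.067, 0.10)
print(f"V_0 = {v0:.3f} V, V_P = {vp:.3f} V, V_V = {vv:.3f} V")
```

In the equal-mass limit ($\alpha \to \infty$) the peak voltage approaches the valley voltage, so the current drops abruptly after the peak; unequal masses open a finite voltage range between the two.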
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The noise is the signal. 
Rolf Landauer 


7.1 Sources of Noise and Basic Concepts

Even though “noise” seems like something that we could live without, engineers and physicists
have learned not only to deal with it but also to exploit it. Quantum mechanics enables
us to see the underlying physical structure of noise so that we do not have to view it as
random glitches disrupting the operation of our devices. Instead, as we see later, the origin
of quantum noise can be traced back to physical processes taking place inside a quantum
device.

Classical noise arises from fluctuations in the number of particles and in their move- 
ment within a given volume. Examples include thermal noise and shot noise. Quantum 
noise arises from uncertainty concerning the position and momentum of quantum particles. 
It can also occur when particles are photons. An example is provided by the unavoidable 
noise of optical amplifiers. There is a fundamental difference between classical and
quantum noise. The former is attributed to the “lack of knowledge” of the system and is
modeled as a random stochastic process. This is because the deterministic classical view 
requires that if all the information about a system is known at some time, its trajectory 
in the relevant phase space can be predicted for all future times without any uncer- 
tainty. Contrary to this, quantum noise is fundamental in the sense that even a complete 
knowledge of the system would leave us with some uncertainty. What is more intrigu- 
ing is that quantum noise sets a fundamental limit on the operation of quantum devices. 
We can see this clearly by investigating why it is not possible to build a noiseless optical 
amplifier. 


7.1.1 Noise Introduced by Optical Amplifiers 


Following the invention of the laser and optical amplifiers, two groups considered noise
in linear optical amplifiers. Haus and Mullen [376] analyzed amplifier noise in the two
field quadratures, without restrictive assumptions on the noise statistics. They found that
the minimum uncertainty in the output of a high-gain amplifier corresponds to $\frac{1}{2}\hbar\omega$ for
photons of energy $\hbar\omega$. A similar conclusion was reached by Heffner [377] using a different
approach. Heffner's method clearly shows that an ideal, noise-free, optical amplifier violates
Heisenberg's uncertainty principle. It also shows that quantum noise is fundamentally
different from our intuitive understanding of classical noise.

Suppose it is possible to construct an ideal optical amplifier with no noise at its output
when a monochromatic noisy signal of frequency $\omega$ is incident on it. Let us assume that
this amplifier receives a stream of input photons whose number is given by $n_I + \Delta n_I$, where
$n_I$ is the average number and $\Delta n_I$ is the uncertainty in this count. If the gain $G$ of this ideal
amplifier is high, the resulting output photon number will be $n_O + \Delta n_O = G(n_I + \Delta n_I)$.
This result indicates that the input signal is linearly amplified with gain $G$, without any
added noise. As this ideal optical amplifier works as a photon-number multiplier, it does
not change the phase uncertainty of the incoming signal given by $\Delta\phi_I$. However, quantum
mechanics demands that the uncertainty relation, $\Delta E\,\Delta t \geq \hbar/2$, be satisfied at both the
input and output ends of the amplifier.

We can calculate this uncertainty at the input end using $\Delta E = \hbar\omega\,\Delta n_I$ and $\Delta t = \Delta\phi_I/\omega$.
The result is


$$\Delta n_I\,\Delta\phi_I \geq \frac{1}{2}. \tag{7.1}$$


This relation is known as the number-phase uncertainty relation. It implies that the photon
number and the phase of an optical field are canonically conjugate variables (like position
and momentum). As a result, it is not possible to know exactly both the phase and
the number of photons simultaneously. It is easy to show that our ideal noiseless optical
amplifier violates the preceding relation. Because the phase remains unchanged, Eq. (7.1)
requires $\Delta n_O\,\Delta\phi_I \geq \frac{1}{2}$ at the output end of the amplifier. Given that $\Delta n_O = G\,\Delta n_I$, we
get the relation $\Delta n_I\,\Delta\phi_I \geq \frac{1}{2G}$, which clearly violates the lower bound in Eq. (7.1) for any
amplifier with $G > 1$. This contradiction implies that an optical amplifier must add noise
to its amplified signal.

We can calculate the added noise as follows. Assume that the optical amplifier is followed
by an ideal detector capable of counting photons. Even an ideal detector should
obey quantum rules during its detection process. We assume that the detector has photon-number
and phase uncertainties given by $\Delta n_D$ and $\Delta\phi_D$, respectively. Noting that the
detector and amplifier operate independently of each other, we can add their variances as


$$\Delta n^2 = \Delta n_O^2 + \Delta n_D^2, \tag{7.2}$$
$$\Delta\phi^2 = \Delta\phi_O^2 + \Delta\phi_D^2, \tag{7.3}$$

where $\Delta n^2$ and $\Delta\phi^2$ are the photon-number and phase uncertainties of the overall process.
They can be directly related to the input signal uncertainties using $\Delta n^2 = G^2\,\Delta n_I^2$ and
$\Delta\phi^2 = \Delta\phi_I^2$. Using these relations, we can write the number-phase uncertainty relation at
the input end in the form

$$\Delta n_I^2\,\Delta\phi_I^2 = \frac{1}{G^2}\left(\Delta n_O^2 + \Delta n_D^2\right)\left(\Delta\phi_O^2 + \Delta\phi_D^2\right). \tag{7.4}$$

The number-phase uncertainty for an ideal detector takes its minimum value,
$\Delta n_D\,\Delta\phi_D = \frac{1}{2}$. We also impose the additional requirement that the detector's uncertainty
ratio $\Delta\phi_D/\Delta n_D$ is such that it minimizes the output number-phase uncertainty. The resulting
condition is found to be

$$\frac{\Delta n_O}{\Delta\phi_O} = \frac{\Delta n_D}{\Delta\phi_D}. \tag{7.5}$$

With these choices, Eq. (7.4) takes the form

$$\Delta n_I^2\,\Delta\phi_I^2 = \frac{1}{G^2}\left[\Delta n_O^2\,\Delta\phi_O^2 + \frac{1}{4} + \Delta n_O\,\Delta\phi_O\right]. \tag{7.6}$$


Finally, we assume that the input signal is the best-quality signal allowed by quantum
mechanics. That is, the number-phase uncertainty product for this signal takes its minimum
value, $\Delta n_I\,\Delta\phi_I = \frac{1}{2}$. With this choice, the product $\Delta n_O\,\Delta\phi_O$ satisfies the quadratic equation


$$(\Delta n_O\,\Delta\phi_O)^2 + \Delta n_O\,\Delta\phi_O - (G^2 - 1)/4 = 0. \tag{7.7}$$


Solving this quadratic equation and picking the positive root, we find that the output
uncertainty product is given by


$$\Delta n_O\,\Delta\phi_O = \frac{1}{2}(G - 1). \tag{7.8}$$


It is clear that this product exceeds the input value of 1/2 for $G > 2$.
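
The algebra leading to Eq. (7.8) can be checked numerically. The sketch below solves the quadratic $x^2 + x - (G^2 - 1)/4 = 0$ for the output product $x = \Delta n_O\,\Delta\phi_O$ and substitutes the root into $(x^2 + x + 1/4)/G^2$, the right-hand side of Eq. (7.6), which should return the squared minimum input product of 1/4. The function names are hypothetical, and the gain values are arbitrary test points.

```python
import math


def output_uncertainty_product(gain):
    """Positive root of x**2 + x - (G**2 - 1)/4 = 0 (the quadratic in Eq. 7.7)."""
    discriminant = 1.0 + (gain ** 2 - 1.0)       # b**2 - 4ac with a = b = 1
    return (-1.0 + math.sqrt(discriminant)) / 2.0


def input_product_squared(x_out, gain):
    """Right-hand side of Eq. (7.6) for a given output product x_out."""
    return (x_out ** 2 + x_out + 0.25) / gain ** 2


for gain in (1.0, 2.0, 10.0, 100.0):
    x = output_uncertainty_product(gain)
    print(f"G = {gain:6.1f}: output product = {x:7.3f}, "
          f"implied input product squared = {input_product_squared(x, gain):.4f}")
```

The root equals $(G - 1)/2$ for every gain tested, and it crosses the input value of 1/2 exactly at $G = 2$, in line with the remark above.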

To calculate the noise power $P_n$, we assume that the noise added by an amplifier is
additive white noise with Gaussian statistics. If the signal is large compared to noise,
the statistics of the output signal will also be nearly Gaussian. In this situation, both its
power and phase fluctuate with variances given by $\Delta P_O^2 = 2P_OP_n$ and $\Delta\phi_O^2 = P_n/(2P_O)$.
Here $P_n$ is the noise power and $P_O$ is the average output power. The latter can be related to the
output photon number as $P_O = \hbar\omega\,n_OB_W$, where $B_W$ is the bandwidth of the amplifier.
Using these relations in Eq. (7.8), we obtain the following expression for the noise
power:


$$P_n = \frac{1}{2}\hbar\omega B_W(G - 1). \tag{7.9}$$

Further details about this fundamental result can be found in the original references [376, 
377]. The enhanced noise at the amplifier’s output degrades the signal-to-noise ratio (SNR) 
of the input signal. The important concepts of SNR and the noise figure of an amplifier are 
discussed in Aside 7.1. 
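
To get a feel for the magnitude of this quantum-limited noise, the following sketch evaluates $P_n = \frac{1}{2}\hbar\omega B_W(G - 1)$ using $\hbar\omega = hc/\lambda$. The operating point (a wavelength of 1550 nm, a gain of 30 dB, and a bandwidth of 12.5 GHz) is an assumed example, not a value from the text, and the function name is hypothetical.

```python
H = 6.62607015e-34          # Planck constant (J s)
C = 2.99792458e8            # speed of light in vacuum (m/s)


def amplifier_noise_power(gain, wavelength_m, bandwidth_hz):
    """Minimum added noise power (W): 0.5 * hbar*omega * B_W * (G - 1)."""
    photon_energy = H * C / wavelength_m         # hbar*omega = h*c/lambda
    return 0.5 * photon_energy * bandwidth_hz * (gain - 1.0)


# Assumed example: 1550 nm signal, G = 1000 (30 dB), B_W = 12.5 GHz
p_n = amplifier_noise_power(1000.0, 1550e-9, 12.5e9)
print(f"Quantum-limited noise power: {p_n * 1e6:.2f} microwatts")
```

For these assumed numbers the result is slightly below one microwatt, and it vanishes for $G = 1$, as expected for a device that provides no amplification.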

The preceding result shows that the noise power of an optical amplifier depends on its
gain $G$ and bandwidth $B_W$. The factor $\frac{1}{2}\hbar\omega$ can be viewed as the zero-point energy of
vacuum. Quantum mechanics shows us that the ground-state energy of a harmonic oscillator
oscillating at the frequency $\omega$ is exactly equal to $\frac{1}{2}\hbar\omega$. The concept of zero-point
energy or zero-point fluctuations is nonclassical and has its origin in the quantization
of an electromagnetic field. The minimum detectable noise power is often referred to
as quantum noise in view of its quantum origin (see discussion in Ref. [378]). At short
wavelengths, this quantum noise can be greater than thermal noise by several orders of
magnitude.


Aside 7.1 Signal-to-Noise Ratio and Noise Figure


The phrase signal-to-noise ratio (SNR) has its origin in telecommunication engineering, 
but the concept is valid in many other areas including optics, biology, and financial mod- 
eling. The SNR compares the strength of a desired signal to the level of background noise, 
and it can be considered a measure of the signal’s quality. In optics, the SNR is determined 
through the signal and noise levels in the current of the photodetector used to measure the 
optical power. If the measured current is $I_p(t)$ and the variance of current noise is $\langle \Delta I_p^2(t) \rangle$,
the SNR is defined as


$$\mathrm{SNR} = \frac{\langle I_p(t) \rangle^2}{\langle \Delta I_p^2(t) \rangle}, \tag{7.10}$$


where the noise variance $\langle \Delta I_p^2 \rangle$ is found by integrating the spectral density of noise over
the bandwidth of interest. It is important to note that the SNR is a ratio of electrical power
levels rather than a ratio of optical power levels.


Just as the SNR serves as a figure of merit to characterize the quality of a signal, the 
quality of optical components including amplifiers is characterized using a measure called 
the noise figure (NF). The reason is that an active component such as an optical amplifier may
use an excited state to transfer energy to the amplified signals via stimulated emission or 
analogous mechanisms. However, spontaneous emission cannot be avoided, and the signal 
gets contaminated by noise as a result. The noise figure enables us to evaluate the impact 
of an active component on the SNR of a signal passing through that component. It does 
this by comparing the input and output values of the SNR. 


Using the nomenclature of the International Electrotechnical Commission (IEC 61291-1:2018),
the noise factor NF is defined as

$$\mathrm{NF}(f_{op}, f) = \frac{\mathrm{SNR}_{in}}{\mathrm{SNR}_{out}}, \tag{7.11}$$

where $\mathrm{SNR}_{in}$ is the SNR at the amplifier's input end and $\mathrm{SNR}_{out}$ is the SNR at its output
end. In general, the noise factor is a function of both the optical frequency $f_{op}$ and the baseband
frequency $f$ [379]. The noise figure ($F$) is the noise factor expressed in decibel (dB) units:


$$F\ (\mathrm{dB}) = 10\log_{10}[\mathrm{NF}(f_{op}, f)]. \tag{7.12}$$
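
The two definitions above amount to a ratio followed by a decibel conversion, as the following minimal sketch shows. The SNR values are made-up illustrative numbers (linear electrical power ratios), and the function names are hypothetical.

```python
import math


def noise_factor(snr_in, snr_out):
    """Noise factor NF = SNR_in / SNR_out, with both SNRs as linear ratios."""
    return snr_in / snr_out


def noise_figure_db(nf):
    """Noise figure F in dB: 10 * log10(NF)."""
    return 10.0 * math.log10(nf)


# Illustrative numbers: an amplifier that halves the SNR of the signal
nf = noise_factor(snr_in=1000.0, snr_out=500.0)
f_db = noise_figure_db(nf)
print(f"NF = {nf:.2f}, F = {f_db:.2f} dB")
```

A component that halves the SNR therefore has a noise figure of about 3 dB, while a component that leaves the SNR unchanged has a noise figure of 0 dB.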


For a telecommunication link consisting of lossy fiber spans with optical amplifiers at
periodic intervals, the noise figure for the entire chain can be computed from the noise
figures of individual amplifiers. If the amplifier in the $n$th section provides the gain $G_n$ to
exactly compensate its span loss $L_n$ and $\mathrm{SNR}_n$ denotes the SNR after the $n$th amplifier, the
total noise factor is given by

$$\mathrm{NF}_{sys} = \frac{\mathrm{SNR}_0}{\mathrm{SNR}_N} = \left(\frac{\mathrm{SNR}_0}{\mathrm{SNR}_1}\right)\left(\frac{\mathrm{SNR}_1}{\mathrm{SNR}_2}\right)\cdots\left(\frac{\mathrm{SNR}_{N-1}}{\mathrm{SNR}_N}\right), \tag{7.13}$$

where $\mathrm{SNR}_0$ is the SNR at the input end of the amplifier chain. The total noise figure of
the amplifier chain is then found to satisfy the relation

$$F_{sys} = \frac{F_1}{L_1} + \frac{F_2}{L_1G_1L_2} + \cdots + \frac{F_N}{L_1G_1L_2G_2\cdots G_{N-1}L_N}. \tag{7.14}$$

Consider the special case with $L_n = 1$ for all $n \leq N$, representing a cascaded chain of $N$
amplifiers with no span losses. Its noise figure is found to be

$$F_{sys} = F_1 + \frac{F_2}{G_1} + \cdots + \frac{F_N}{G_1G_2\cdots G_{N-1}}. \tag{7.15}$$

This relation shows that the noise figure of a multistage amplifier is dominated by the noise
of the first stage.



7.1.2 Classical Spectral Density 


The correspondence principle, formulated in 1923 by Niels Bohr, is a distillation of his 
thoughts that led him to the development of atomic theory, an early form of quantum 
mechanics [380]. The conceptual foundation of this principle relies on the idea that quan- 
tum mechanics contains classical mechanics as a limiting case. However, the concept is 
generic in the sense that any new theory of a physical phenomenon should encompass the 
associated old theories. Examples include the special theory of relativity, which reduces to 
the Newtonian mechanics at speeds low compared with the speed of light, and statistical 
mechanics, which reproduces thermodynamics when the number of particles is large. 

The concept of classical noise is well developed and widely used. Thus, a theory of 
quantum noise should encompass all known results for classical noise. As we shall see, the 
measures of quantum noise are indeed defined such that they reduce to the classical mea- 
sures, thus complying with the correspondence principle. What is intriguing is that, even 
though we extend the classical concepts to the quantum regime, the results and interpreta- 
tions of quantum noise processes go beyond the conventional knowledge and open up new 
possibilities unique to quantum mechanics. 

It is worthwhile to review first the most fundamental properties of classical noise that we 
eventually extend to the quantum domain. Noise gets introduced to a classical device via 
its interaction with its environment. This is the reason why a model for the environment is 
first constructed for any discussion of noise of a classical device. Although it is not easy to 
find exact models for the device-environment interaction, it is possible to model noise in real
physical systems by closely studying the observed properties of a system and making sure 
that the model agrees with them. 

Suppose $\mathcal{N}(t)$ describes in such a model the noise process experienced by a classical
device. A fundamental assumption in all such models is that the underlying noise process
obeys Gaussian statistics. This assumption is not severely restrictive because of the
central limit theorem, which states that the sum of many independent random variables
is itself a random variable whose distribution converges to a Gaussian distribution as the
number of independent variables increases. Therefore, physical quantities that are expected
to be the sum of many independent processes (such as thermal noise discussed later) often
have distributions that are nearly Gaussian. It is sufficient to know the first- and second-order
moments to fully describe a Gaussian random process. The first moment, the mean
or the average, can always be reduced to zero (i.e., we can assume $\langle \mathcal{N}(t) \rangle = 0$). Thus,
the second-order moment can fully characterize a Gaussian random process. The adopted
measure is the autocorrelation function defined as


$$\Gamma_{\mathcal{NN}}(t, \tau) = \langle \mathcal{N}(t)\,\mathcal{N}(t + \tau) \rangle. \tag{7.16}$$


As this definition employs two different times, the time difference $\tau$ can be considered
a measure of the memory of the underlying noise process. The maximum duration over
which a signal retains its memory is called the correlation time. If we denote it by $\tau_c$, we
have the condition $\Gamma_{\mathcal{NN}}(t, \tau_c) = 0$. A signal is called white noise when $\tau_c$ is close to zero.

Noise in the real world has a small but finite correlation time [381]. When $\tau_c$ is much
smaller than other characteristic time scales of a noisy device, it is common to employ the
white-noise assumption. Such an assumption was adopted earlier for the Langevin noise.
We also take the noise process to be stationary (i.e., its statistical features do not change
with a shift of time origin). Mathematically, the statistical behavior remains unchanged
for $\mathcal{N}(t)$ and $\mathcal{N}(t + t_0)$ for any value of $t_0$. A consequence of this assumption is that the
autocorrelation function $\Gamma_{\mathcal{NN}}(t, \tau)$ becomes a function of the single variable $\tau$.

It is difficult to compare two time-domain traces of a stationary random process because
they appear quite similar. The situation is different if we compare their spectra in the frequency
domain. For this reason, it is useful to analyze noise in the frequency domain. The
Wiener–Khinchin theorem states that the Fourier transform of the autocorrelation function,


$$S_{\mathcal{NN}}(\omega) = \mathcal{F}\{\Gamma_{\mathcal{NN}}(\tau)\} = \int_{-\infty}^{\infty} \Gamma_{\mathcal{NN}}(\tau)\,e^{i\omega\tau}\,d\tau, \tag{7.17}$$


provides the frequency-domain representation of a noise process. This quantity is known
as the spectral density. As the autocorrelation function is real valued for the noise process
$\mathcal{N}(t)$, its spectral density is symmetric around $\omega = 0$ and satisfies the relation $S_{\mathcal{NN}}(-\omega) =
S_{\mathcal{NN}}(\omega)$. This symmetry is used in electrical engineering where only positive frequencies
are considered. It is common to refer to $S_{\mathcal{NN}}(\omega)$ as the two-sided spectral density. When
only positive frequencies are included, the resulting spectral density is known as the single-sided
spectral density.
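
The Wiener–Khinchin relation can be checked numerically for a simple case. The sketch below assumes an exponentially decaying autocorrelation, $\Gamma(\tau) = \sigma^2 \exp(-|\tau|/\tau_c)$, integrates it against $\cos(\omega\tau)$ with the trapezoidal rule (for a real, even autocorrelation only the cosine part of $e^{i\omega\tau}$ survives), and compares the result with the analytic Lorentzian $2\sigma^2\tau_c/(1 + \omega^2\tau_c^2)$. The choice of autocorrelation, the unit values of $\sigma^2$ and $\tau_c$, and all function names are assumptions made for illustration.

```python
import math


def spectral_density(autocorr, omega, tau_max, steps=20000):
    """Trapezoidal estimate of the integral of autocorr(tau) * cos(omega * tau)
    over [-tau_max, tau_max]; valid for a real, even autocorrelation function."""
    d_tau = 2.0 * tau_max / steps
    total = 0.0
    for k in range(steps + 1):
        tau = -tau_max + k * d_tau
        weight = 0.5 if k in (0, steps) else 1.0   # half weight at the end points
        total += weight * autocorr(tau) * math.cos(omega * tau)
    return total * d_tau


def lorentzian(omega, variance, tau_c):
    """Analytic transform of variance * exp(-|tau| / tau_c)."""
    return 2.0 * variance * tau_c / (1.0 + (omega * tau_c) ** 2)


variance, tau_c = 1.0, 1.0                         # assumed unit values


def exp_autocorr(tau):
    return variance * math.exp(-abs(tau) / tau_c)


for omega in (0.0, 1.0, 2.0):
    numeric = spectral_density(exp_autocorr, omega, tau_max=40.0 * tau_c)
    print(f"omega = {omega:.1f}: numeric = {numeric:.4f}, "
          f"analytic = {lorentzian(omega, variance, tau_c):.4f}")
```

The spectrum is flat for $\omega\tau_c \ll 1$, which is the sense in which a short correlation time justifies the white-noise assumption.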


7.1.3 Thermal Noise 


An example of a classical noise process is provided by thermal noise, also known as
Johnson–Nyquist noise because it was first detected in 1928 by Johnson [180] and
explained by Nyquist [181]. The origin of thermal noise lies in thermal agitation of electrons
inside an electrical conductor. As the thermal agitation increases with temperature,
thermal noise power also increases with temperature. In contrast, thermal noise power
does not depend on the applied voltage or electrical parameters of the device such as its
resistance.

It is instructive to derive an expression for the spectral density of thermal noise using a
microscopic model [382, 383]. Even though such a derivation can be done for more exotic
structures such as a p-n junction [384], we focus on a resistor of length $L$ with a rectangular
cross section of area $A$. Suppose the density of electrons inside the resistor is $N_e$, and
these electrons have a collision time $\tau_e$ (see Section 1.5 for details on these microscopic
parameters). Using conduction theory in Ref. [382], we can find the conductivity $\sigma$ and
resistance $R = L/(A\sigma)$ in terms of the microscopic parameters as


$$\sigma = \frac{N_e q_e^2 \tau_e}{m_e}, \qquad R = \frac{m_e L}{A N_e q_e^2 \tau_e}. \tag{7.18}$$

The average drift speed of electrons inside the resistor can be written as

$$\langle u \rangle = \frac{1}{N_e A L}\sum_j u_j, \tag{7.19}$$

where $N_e A L$ is the total number of electrons within the resistor and $u_j$ is the speed of the $j$th
electron. The current flowing along the resistor can be written as $I = q_e N_e A \langle u \rangle$. Invoking
Ohm's law, the voltage $V$ across the resistor is $V = RI = R q_e N_e A \langle u \rangle$. Substituting the
value of $\langle u \rangle$ from the preceding equation, we get


$$V = \sum_j V_j, \qquad V_j = \frac{R q_e}{L}u_j, \tag{7.20}$$


where $V_j$ is the voltage induced by the $j$th electron. It is a random quantity that changes as
this electron suffers collisions inside the resistor. Using the definition of the collision time
$\tau_e$, we can write its correlation function in the form

$$\Gamma_{V_j V_j}(\tau) = \langle V_j(t) V_j(t + \tau) \rangle = \langle V_j^2 \rangle \exp(-|\tau|/\tau_e). \tag{7.21}$$


The two-sided spectral density of $V_j$ is found using this form of $\Gamma_{V_j V_j}(\tau)$ in Eq. (7.17).
The integral can be easily done to obtain the result [385]

$$S_{V_j V_j}(\omega) = \frac{2 \langle V_j^2 \rangle \tau_e}{1 + \omega^2 \tau_e^2}. \tag{7.22}$$

At this point we note that $\tau_e$ is relatively small (< 1 ps) in a typical resistor. As a result,
$\omega^2 \tau_e^2 \ll 1$ at frequencies as high as 10 GHz. Neglecting it and using $V_j = (R q_e/L) u_j$, we
obtain

$$S_{V_j V_j}(\omega) \approx 2 (R q_e/L)^2 \langle u_j^2 \rangle \tau_e. \tag{7.23}$$


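We now invoke the law of equipartition of energy, which states that each electron
contributes $\frac{1}{2} k_B T$ to the average energy [386]. Using $\frac{1}{2} m_e \langle u_j^2 \rangle = \frac{1}{2} k_B T$, we obtain
$\langle u_j^2 \rangle = k_B T/m_e$. Substituting this value in the preceding expression, we finally get

$$S_{V_j V_j}(\omega) = \frac{2 R^2 q_e^2 \tau_e}{L^2} \left( \frac{k_B T}{m_e} \right). \tag{7.24}$$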


Our objective is to obtain the power spectral density of $V(t)$, where $V(t) = \sum_j V_j(t)$.
To calculate this quantity, we invoke Campbell's theorem; a detailed proof of this theorem
can be found in Ref. [159]. Campbell's theorem states that $S_{VV} = N_p S_{V_j V_j}$, where $N_p$ is
the total number of electrons contributing to the sum in $V(t)$. In our case $N_p = N_e A L$,
resulting in

$$S_{VV}(\omega) = 2 k_B T R^2 \left( \frac{N_e A q_e^2 \tau_e}{m_e L} \right) = 2 k_B T R. \tag{7.25}$$

The spectral density of thermal noise depends only on the resistance R, in addition to 
temperature. It is independent of frequency (white noise) up to frequencies as high as 
10 GHz because of a short collision time of electrons inside the resistor. The single-sided 
power spectral density of thermal noise is two times that of the preceding result: 


$$S_{VV}(\omega) = 4 k_B T R. \tag{7.26}$$

This result is a special case of the general connection existing between fluctuations
and dissipations in any physical system through the fluctuation-dissipation theorem
(see Section 3.3). If we know the bandwidth $B_V$ over which thermal noise contributes, we
can calculate the noise power associated with voltage fluctuations using
$P_{VN} = S_{VV} B_V/R = (4 k_B T) B_V$. This power is independent of any electrical parameters of the resistor.
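
The single-sided result in Eq. (7.26) is easy to evaluate numerically. The following is a minimal sketch, not part of the original derivation; the function names and the example values (1 kΩ, 300 K, 1 MHz) are assumptions chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def thermal_noise_psd(resistance_ohm, temperature_k):
    """Single-sided voltage spectral density 4*k_B*T*R of Eq. (7.26), in V^2/Hz."""
    return 4.0 * K_B * temperature_k * resistance_ohm


def thermal_noise_vrms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS noise voltage sqrt(S_VV * B) over a bandwidth B, in volts."""
    return math.sqrt(thermal_noise_psd(resistance_ohm, temperature_k) * bandwidth_hz)


def thermal_noise_power(temperature_k, bandwidth_hz):
    """Noise power S_VV * B / R = 4*k_B*T*B in watts; the resistance cancels."""
    return 4.0 * K_B * temperature_k * bandwidth_hz


if __name__ == "__main__":
    # Illustrative values only: a 1 kOhm resistor at 300 K observed over 1 MHz.
    v_rms = thermal_noise_vrms(1e3, 300.0, 1e6)
    print(f"rms noise voltage: {v_rms * 1e6:.2f} microvolts")
    print(f"noise power: {thermal_noise_power(300.0, 1e6):.3e} W")
```

Because the resistance cancels in the noise power, `thermal_noise_power` takes only the temperature and the bandwidth as arguments.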


7.1.4 Shot Noise 


As a second example of noise, we consider the shot noise, arising from current fluctuations 
that occur because of the discrete nature of electrical charges. Shot noise can be readily 
observed in devices such as tunnel junctions, Schottky diodes, and p-n junctions. The 
discovery of shot noise can be traced back to Schottky, who observed that vacuum tubes
produced two types of noise, described by him as the Wärmeeffekt and the Schroteffekt
[387]. The first one is the thermal noise. The second is the shot noise observed as current
fluctuations. The power of this noise is proportional to the average current, instead of the 
square of the current as would be expected from Ohm’s law. This feature is related to 
the random arrival times of different electrons. Even though the current is constant on 
average inside a vacuum tube, electrons arrive at the anode at random times, resulting 
in current fluctuations (shot noise). In recent years, experiments on nanoscale conductors 
have provided a better understanding of shot noise [388]. 

The spectral density of shot noise can be calculated using a simplified model where a
stream of electrons, each having the charge $q_e$, arrives at a detector at random time
intervals. Let $N$ be the number of electrons received at the detector in a fixed time interval $T$.
The resulting current will vary with time and can be written as

$$I(t) = \sum_{n=1}^{N} q_e \, \delta(t - t_n), \tag{7.27}$$

where $t_n$ is the arrival time of the $n$th electron. In the limit of large $T$, this current would
approach a constant average value given by

$$\langle I \rangle = \lim_{T \to \infty} \frac{q_e N}{T}. \tag{7.28}$$

Even though the current $I(t)$ fluctuates in time because $t_n$ is a random variable, we expect
this random process to be stationary in the sense that the average current $\langle I \rangle$ will be
independent of the starting time.

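The autocorrelation function, $\Gamma_{II}(\tau) = \langle I(t) I(t + \tau) \rangle$, depends only on the time
difference $\tau$ and can be calculated using

$$\Gamma_{II}(\tau) = q_e^2 \left\langle \sum_{i=1}^{N} \sum_{j=1}^{N} \delta(t + \tau - t_i) \, \delta(t - t_j) \right\rangle. \tag{7.29}$$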
i=] j=1 


Assuming that this stationary random process is ergodic, we can replace the ensemble
average with the time average taken in the limit $T \to \infty$:

$$\Gamma_{II}(\tau) = \lim_{T \to \infty} \frac{q_e^2}{T} \sum_{i=1}^{N} \sum_{j=1}^{N} \int_{-T/2}^{T/2} \delta(t + \tau - t_i) \, \delta(t - t_j) \, dt
= \lim_{T \to \infty} \frac{q_e^2}{T} \sum_{i=1}^{N} \sum_{j=1}^{N} \delta(t_j + \tau - t_i). \tag{7.30}$$

Application of the Wiener-Khinchin theorem to this autocorrelation function provides
us with the spectral density through the Fourier transform:

$$S_{II}(\omega) = \int_{-\infty}^{\infty} \Gamma_{II}(\tau) \, e^{i\omega\tau} \, d\tau. \tag{7.31}$$


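By substituting $\Gamma_{II}(\tau)$, the spectral density can be written as the sum of two parts

$$S_{II}(\omega) = \lim_{T \to \infty} \frac{q_e^2}{T} \left[ \sum_{i=1}^{N} \int_{-\infty}^{\infty} \delta(\tau) \, e^{i\omega\tau} \, d\tau + \sum_{i=1}^{N} \sum_{j \neq i} \int_{-\infty}^{\infty} \delta(t_j + \tau - t_i) \, e^{i\omega\tau} \, d\tau \right]
= \lim_{T \to \infty} \frac{q_e^2}{T} \left[ N + \sum_{i=1}^{N} \sum_{j \neq i} \exp[i\omega(t_i - t_j)] \right], \tag{7.32}$$

where both integrals over $\tau$ could be done easily. The second term is a sum of a large
number of random complex numbers that vanishes as $T \to \infty$. The first term is related
to the average current in the same limit. As a result, the two-sided spectral density of shot
noise reduces to the simple expression:

$$S_{II}(\omega) = q_e \langle I \rangle. \tag{7.33}$$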


The single-sided spectral density is obtained by multiplying it with a factor of two. 
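
The model of randomly arriving electrons can be checked with a short simulation. The sketch below is an illustration under stated assumptions, not the author's method: arrivals are taken to be a Poisson process (exponentially distributed waiting times), the current is averaged over bins of width `bin_width`, and all function names and parameter values are hypothetical. For white noise with the two-sided density $q_e \langle I \rangle$, the variance of the bin-averaged current should be close to $q_e \langle I \rangle$ divided by the bin width.

```python
import random

Q_E = 1.602176634e-19  # elementary charge in coulombs (exact SI value)


def simulate_counts(rate, bin_width, n_bins, seed=0):
    """Count Poisson arrivals (exponential waiting times) in consecutive time bins."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    t_end = n_bins * bin_width
    t = rng.expovariate(rate)
    while t < t_end:
        index = min(int(t / bin_width), n_bins - 1)  # guard against rounding at the edge
        counts[index] += 1
        t += rng.expovariate(rate)
    return counts


def current_stats(counts, bin_width):
    """Mean and sample variance of the bin-averaged current q_e * n / bin_width."""
    currents = [Q_E * c / bin_width for c in counts]
    n = len(currents)
    mean = sum(currents) / n
    var = sum((i - mean) ** 2 for i in currents) / (n - 1)
    return mean, var


if __name__ == "__main__":
    # Assumed illustration: 5 electrons per unit time on average, 20000 unit-width bins.
    width = 1.0
    counts = simulate_counts(rate=5.0, bin_width=width, n_bins=20000)
    mean_i, var_i = current_stats(counts, width)
    # This ratio should be close to 1 if the two-sided density equals q_e * <I>.
    print("var(I) * bin_width / (q_e * <I>) =", var_i * width / (Q_E * mean_i))
```

The printed ratio is a statistical estimate, so it fluctuates around unity from one seed to the next.
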
7.1.5 Brownian Motion 


The erratic movement of microscopic particles immersed in a fluid is called Brownian
motion, after Robert Brown who first studied such fluctuations in 1827. Brownian
motion provides an excellent example of classical noise, and its theory provides useful
mathematical tools used to study the dynamics of nonequilibrium systems. The stochastic
differential equation for describing the Brownian motion is known as the Langevin
equation. It contains both the frictional force and a random force that fluctuates with time.
The fluctuation-dissipation theorem, discussed in Section 3.3, relates these forces to each 
other and provides a firm theoretical footing for describing similar phenomena such as the 
motion of ions in solutions or reorientation of dipolar molecules. Also, the theory has been 
extended to situations where the Brownian particle is not a real particle at all but repre- 
sents some collective property of a macroscopic system, which could be a quantum device 
coupled to a large reservoir in equilibrium. 

To simplify the following analysis, we consider the motion of a spherically symmetric
particle of mass $m_p$ inside a large isotropic reservoir of fluid. In this situation, it suffices to
consider one-dimensional motion of the particle in a given direction. The velocity $V_p(t)$ of
the Brownian particle is affected by two forces, a friction force and a random force. The
friction force can be described by Stokes's law and has the form

$$F_d(t) = -\gamma_d V_p(t), \tag{7.34}$$

where $\gamma_d$ is a constant. The second force results from collisions of the Brownian particle
with particles of the fluid surrounding it. Owing to the random nature of such collisions,
this force is modeled as a stochastic process $F_s(t)$. The most commonly used assumptions
about this force are [389]:


1. $F_s(t)$ is independent of the velocity of the Brownian particle and has a zero mean value
such that $\langle F_s(t) \rangle = 0$, where the average is taken over an ensemble containing the
Brownian particle and the particles of the fluid. A consequence of the zero mean value
is that the stochastic force has no effect on the average motion of the Brownian particle.

2. The stochastic process $F_s(t)$ obeys Gaussian statistics and is thus fully characterized
by its correlation function. In addition, it is a Markovian process with no past memory
such that its correlation function has the form

$$\langle F_s(t) F_s(t') \rangle = G \, \delta(t - t'). \tag{7.35}$$

The parameter $G$ can be calculated using the fluctuation-dissipation theorem. For a
Gaussian process, the second-order correlation function can be used to find all higher-order
correlation functions of $F_s(t)$. In particular, all odd-order correlation functions
vanish for a Gaussian random process. The preceding form of the correlation function
implies the causality relation [390]

$$\langle V_p(t) F_s(t') \rangle = 0 \quad \text{for } t' > t, \tag{7.36}$$

(i.e., future values of the stochastic force do not influence the particle's dynamics).
3. Any ensemble average over the stationary random process can be replaced with the
corresponding time average defined as

$$\langle f[F_s(t)] \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T f[F_s(t)] \, dt, \tag{7.37}$$

where $f$ denotes an arbitrary function of $F_s(t)$. The two averages are related to each other
through the concept of ergodicity. If the time interval $T$ is long enough, the phase-space
trajectory describing evolution of the system comes arbitrarily close to any specified
point in the phase space accessible to the system. As a result, the phase-space average
(or ensemble average) can be replaced with the time average of the same quantity.


With the specification of all forces, the motion of the Brownian particle can be described
using Newton's second law:

$$m_p \frac{dV_p}{dt} = F_d(t) + F_s(t) = -\gamma_d V_p(t) + F_s(t). \tag{7.38}$$

This linear equation can be easily integrated to obtain the solution

$$V_p(t) = V_0 \exp(-\gamma_p t) + \frac{1}{m_p} \int_0^t \exp[-\gamma_p (t - \tau)] \, F_s(\tau) \, d\tau, \tag{7.39}$$

where $\gamma_p = \gamma_d/m_p$ and $V_0$ is the initial velocity of the particle at $t = 0$. Owing to the zero
mean of the stochastic force $F_s(t)$, the average velocity of the particle is not affected by the
random force and is given by

$$\langle V_p(t) \rangle = V_0 \exp(-\gamma_p t). \tag{7.40}$$

The two-time correlation function of the velocity can also be calculated as

$$\Gamma_{VV}(t, t') = \langle V_p(t) V_p(t') \rangle = V_0^2 \exp[-\gamma_p (t + t')]
+ \frac{1}{m_p^2} \int_0^t \int_0^{t'} \exp[-\gamma_p (t + t' - \tau - \tau')] \, G \, \delta(\tau - \tau') \, d\tau \, d\tau', \tag{7.41}$$


where we have used the relation in Eq. (7.35). The evaluation of the double integral requires
some thought because the upper limits are different for $\tau$ and $\tau'$. Owing to the presence of
the delta function, this integral is not zero only when $\tau = \tau'$. If we first integrate over $\tau'$,
the remaining integral over $\tau$ must stop at $t$ if $t < t'$ or at $t'$ if $t' < t$. Thus,

$$\langle V_p(t) V_p(t') \rangle = V_0^2 \exp[-\gamma_p (t + t')] + \frac{G}{m_p^2} \int_0^{\min(t, t')} \exp[-\gamma_p (t + t' - 2\tau)] \, d\tau. \tag{7.42}$$

The integration over $\tau$ can now be carried out to obtain

$$\langle V_p(t) V_p(t') \rangle = V_0^2 e^{-\gamma_p (t + t')} + \frac{G}{2 m_p^2 \gamma_p} \left[ \exp(-\gamma_p |t - t'|) - \exp(-\gamma_p (t + t')) \right]. \tag{7.43}$$

It is clear from this expression that the two exponential terms containing $t + t'$ will become
negligible for values of $t$ and $t'$ larger than $1/\gamma_p$. In this long time limit ($\gamma_p t \gg 1$ and
$\gamma_p t' \gg 1$), the correlation function becomes independent of the initial velocity and depends
only on the time difference $|t - t'|$ as

$$\langle V_p(t) V_p(t') \rangle = \frac{G}{2 m_p \gamma_d} \exp\left( -\frac{\gamma_d}{m_p} |t - t'| \right), \tag{7.44}$$

where we used the relation $\gamma_p = \gamma_d/m_p$.

We can calculate the average energy $\langle E \rangle$ of the Brownian particle as

$$\langle E \rangle = \frac{1}{2} m_p \langle V_p^2(t) \rangle = \frac{G}{4\gamma_d}. \tag{7.45}$$

However, from the equipartition law of classical statistical mechanics $\langle E \rangle$ should be related
in thermal equilibrium to the temperature $T$ of the reservoir (fluid) as $\langle E \rangle = \frac{1}{2} k_B T$.
Equating these two energy expressions, the noise-strength parameter $G$ is found to be

$$G = 2 \gamma_d k_B T. \tag{7.46}$$

This result can be viewed as a consequence of the fluctuation-dissipation theorem discussed
in Section 3.3. Thermal equilibrium demands that a balance must exist between friction
(related to $\gamma_d$), which tends to attenuate the particle's speed, and fluctuations (related to $G$),
which tend to keep the particle moving.
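
The balance between friction and fluctuations can be illustrated by integrating the Langevin equation (7.38) numerically. The sketch below uses a simple Euler-Maruyama step, which is an assumption of this example rather than a method from the text; the dimensionless parameter values and the function name are likewise hypothetical. With $G = 2\gamma_d k_B T$, the long-time mean-square velocity should approach $k_B T/m_p$, up to a small time-step bias and statistical scatter.

```python
import math
import random


def mean_square_velocity(gamma_d=1.0, mass=1.0, k_b_t=1.0, dt=0.02,
                         n_steps=400, n_particles=2000, seed=1):
    """Ensemble average of V_p^2 at t = n_steps * dt, starting from V_p = 0."""
    rng = random.Random(seed)
    noise_strength = 2.0 * gamma_d * k_b_t           # G from Eq. (7.46)
    gamma_p = gamma_d / mass
    # The integral of F_s over one step is Gaussian with variance G * dt,
    # so the velocity kick per step has standard deviation sqrt(G * dt) / m_p.
    kick = math.sqrt(noise_strength * dt) / mass
    total = 0.0
    for _particle in range(n_particles):
        v = 0.0
        for _step in range(n_steps):
            v += -gamma_p * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v * v
    return total / n_particles


if __name__ == "__main__":
    # Dimensionless illustration: k_B*T = m_p = gamma_d = 1, so <V_p^2> should be near 1.
    print("simulated <V_p^2>:", mean_square_velocity())
    print("equipartition value k_B*T/m_p:", 1.0)
```

The total simulated time here is eight damping times, so the memory of the initial velocity is negligible.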


7.1.6 Generalized Einstein Relation 


The Langevin equation (7.38) obtained for the Brownian motion is not in the standard form
of this equation. The standard form for a set of Langevin equations is

$$\frac{dA_\mu}{dt} = D_\mu(t) + F_\mu(t), \tag{7.47}$$

where $D_\mu(t)$ is referred to as the drift term and $F_\mu(t)$ is the Langevin force. The index $\mu$
acts as a label when multiple Langevin equations are needed for a system requiring several
physical variables for its description. As before, the Langevin force $F_\mu$ has zero mean and
is modeled as a Markovian stochastic process with the correlation function

$$\langle F_\nu(t) F_\mu(t') \rangle = 2 D_{\nu\mu} \, \delta(t - t'). \tag{7.48}$$

The constant $D_{\nu\mu}$ is called the diffusion coefficient of the Langevin force. The generalized
Einstein relation allows one to express $D_{\nu\mu}$ in terms of four quantities: $D_\nu(t)$, $D_\mu(t)$, $A_\nu(t)$,
and $A_\mu(t)$.

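We start the derivation by using an identity relating $A_\mu$ at two different times through
the integral

$$A_\mu(t) = A_\mu(t - \Delta t) + \int_{t - \Delta t}^{t} \frac{d}{dt'} A_\mu(t') \, dt'. \tag{7.49}$$

We multiply this equation with $F_\nu(t)$ and perform the ensemble average to obtain

$$\langle F_\nu(t) A_\mu(t) \rangle = \langle F_\nu(t) A_\mu(t - \Delta t) \rangle + \int_{t - \Delta t}^{t} \left\langle F_\nu(t) \frac{d}{dt'} A_\mu(t') \right\rangle dt'. \tag{7.50}$$

Using the Langevin equation (7.47) for the time derivative, we get

$$\langle F_\nu(t) A_\mu(t) \rangle = \langle F_\nu(t) A_\mu(t - \Delta t) \rangle + \int_{t - \Delta t}^{t} \langle F_\nu(t) [D_\mu(t') + F_\mu(t')] \rangle \, dt'. \tag{7.51}$$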


The first term vanishes, $\langle F_\nu(t) A_\mu(t - \Delta t) \rangle = 0$, owing to the causality requirement
that $A_\mu(t - \Delta t)$ cannot be affected by the stochastic force $F_\nu(t)$ at a later time. Similarly,
$\langle F_\nu(t) D_\mu(t') \rangle$ is zero except when $t' = t$; its integral evaluates to zero because the nonzero
value occurs over a set of measure zero. The remaining integral can be simplified using
$\tau = t' - t$ to obtain

$$\int_{t - \Delta t}^{t} \langle F_\nu(t) F_\mu(t') \rangle \, dt' = \int_{-\Delta t}^{0} \langle F_\nu(t) F_\mu(t + \tau) \rangle \, d\tau = \int_{-\infty}^{0} \langle F_\nu(0) F_\mu(\tau) \rangle \, d\tau, \tag{7.52}$$

where the lower limit was extended to infinity in view of Eq. (7.48). We also used the
stationarity of $F_\nu(t)$, which ensures that any correlation function depends on the time
difference of the two times involved. Collecting these results, we obtain

$$\langle F_\nu(t) A_\mu(t) \rangle = \frac{1}{2} \int_{-\infty}^{\infty} \langle F_\nu(0) F_\mu(\tau) \rangle \, d\tau. \tag{7.53}$$

Using the correlation function in Eq. (7.48), we obtain the result

$$\langle F_\nu(t) A_\mu(t) \rangle = D_{\nu\mu}. \tag{7.54}$$

Using the same procedure, we can also show that

$$\langle A_\nu(t) F_\mu(t) \rangle = D_{\nu\mu}. \tag{7.55}$$

We use these two results to calculate the derivative of the correlation function
$\langle A_\nu(t) A_\mu(t) \rangle$:

$$\frac{d}{dt} \langle A_\nu(t) A_\mu(t) \rangle = \left\langle \frac{dA_\nu}{dt} A_\mu(t) + A_\nu(t) \frac{dA_\mu}{dt} \right\rangle
= \langle [D_\nu(t) + F_\nu(t)] A_\mu(t) + A_\nu(t) [D_\mu(t) + F_\mu(t)] \rangle
= \langle D_\nu(t) A_\mu(t) \rangle + \langle A_\nu(t) D_\mu(t) \rangle + 2 D_{\nu\mu}. \tag{7.56}$$


This result is called the generalized Einstein relation when written in the form

$$2 D_{\nu\mu} = \frac{d}{dt} \langle A_\nu(t) A_\mu(t) \rangle - \langle A_\nu(t) D_\mu(t) \rangle - \langle D_\nu(t) A_\mu(t) \rangle. \tag{7.57}$$

It enables us to express the diffusion coefficient $D_{\nu\mu}$ of the random force in terms of the
drift parameters $D_\nu$ and $D_\mu$ and the time derivative of the correlation function $\langle A_\nu(t) A_\mu(t) \rangle$.
It can be viewed as a manifestation of the fluctuation-dissipation theorem discussed in
Section 3.3.
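
As a consistency check, the generalized Einstein relation can be applied to the Brownian particle of Section 7.1.5. The worked example below is a sketch added for illustration; the symbol $D_{VV}$ for the single diffusion coefficient is an assumed label. Dividing Eq. (7.38) by $m_p$ puts it in the standard form with $A = V_p$, $D = -\gamma_p V_p$, and $F = F_s/m_p$.

```latex
% Langevin force F = F_s/m_p, so Eq. (7.35) fixes the diffusion coefficient:
\langle F(t) F(t') \rangle = \frac{G}{m_p^2}\,\delta(t - t') = 2 D_{VV}\,\delta(t - t')
\quad\Longrightarrow\quad D_{VV} = \frac{G}{2 m_p^2}.

% Generalized Einstein relation (7.57) with A = V_p and D = -\gamma_p V_p:
2 D_{VV} = \frac{d}{dt}\langle V_p^2 \rangle + 2\gamma_p \langle V_p^2 \rangle .

% In thermal equilibrium d<V_p^2>/dt = 0 and <V_p^2> = k_B T / m_p, so
\frac{G}{m_p^2} = \frac{2\gamma_p k_B T}{m_p}
\quad\Longrightarrow\quad G = 2 m_p \gamma_p k_B T = 2\gamma_d k_B T ,
```

which agrees with Eq. (7.46).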

The classical analysis of this section can be easily extended to the quantum domain.
In the quantum version of the Langevin equation given in (7.47), the variables $A_\mu$, $D_\mu$,
and $F_\mu$ become Heisenberg operators. As a result, an ensemble average denotes quantum
averaging over the initial state of the underlying quantum system. The generalized Einstein
relation in Eq. (7.57) remains valid with this interpretation of the averages appearing in this
relation. Also, the quantum Langevin equation provides an approximate method for solving
the quantum master equations [391] of this system. We present in Aside 7.2 the quantum
regression theorem that is useful for calculating the two-time correlation function of $A_\mu$.


Aside 7.2 Quantum Regression Theorem

In the derivation of the generalized Einstein relation in Section 7.1.6, we calculated the
correlation function $\langle A_\nu(t) A_\mu(t) \rangle$ involving a single time. The two-time correlation
function, defined as $\langle A_\nu(t) A_\mu(t') \rangle$ where $t' < t$, provides more information, and its knowledge
is desirable. However, knowledge of the density matrix is not sufficient to calculate such
correlation functions because we also need to know the transition probabilities among
various quantum states. The quantum regression theorem shows that, under certain conditions,
it is possible to calculate the two-time correlation function if we know how $\langle A_\nu(t) \rangle$ evolves
with time.

To derive this theorem, we consider the derivative of $\langle A_\nu(t) A_\mu(t') \rangle$ with respect to $t$. It is
easy to see that

$$\frac{d}{dt} \langle A_\nu(t) A_\mu(t') \rangle = \left\langle \frac{dA_\nu(t)}{dt} A_\mu(t') \right\rangle = \langle [D_\nu(t) + F_\nu(t)] A_\mu(t') \rangle = \langle D_\nu(t) A_\mu(t') \rangle + \langle F_\nu(t) A_\mu(t') \rangle. \tag{7.58}$$

If we use the fact that $\langle F_\nu(t) A_\mu(t') \rangle = 0$ if $t' < t$ because $A_\mu(t')$ cannot depend on the
future values of the random force $F_\nu(t)$ owing to the causality requirement, we obtain the
quantum regression theorem in the form

$$\frac{d}{dt} \langle A_\nu(t) A_\mu(t') \rangle = \langle D_\nu(t) A_\mu(t') \rangle. \tag{7.59}$$

It states that the two-time correlation function has the same time evolution as the average
$\langle A_\nu(t) \rangle$ [391]. This theorem is attributed to Melvin Lax, who made many fundamental
contributions in a series of papers on quantum noise [392, 393].
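
A brief worked example, added here as an illustrative sketch, shows the theorem at work for a linear drift term. Taking $A = V_p$ and $D = -\gamma_p V_p$ from the Brownian-motion problem of Section 7.1.5 (a classical stand-in for the quantum case), Eq. (7.59) gives:

```latex
\frac{d}{dt}\langle V_p(t) V_p(t') \rangle = -\gamma_p \langle V_p(t) V_p(t') \rangle
\qquad (t > t'),

% which integrates to
\langle V_p(t) V_p(t') \rangle = \langle V_p^2(t') \rangle \exp[-\gamma_p (t - t')].

% With the long-time value <V_p^2> = G/(2 m_p \gamma_d) this becomes
\langle V_p(t) V_p(t') \rangle = \frac{G}{2 m_p \gamma_d} \exp[-\gamma_p (t - t')],
```

which reproduces Eq. (7.44) for $t > t'$ without evaluating the double integral in Eq. (7.41).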


7.2 Quantum Spectral Density 


The quantum version of classical noise replaces $N(t)$ with a Hermitian operator, $\hat{N}(t)$,
in the Heisenberg picture. The quantum form of the autocorrelation function is then
defined as

$$\Gamma_{NN}(\tau) = \langle \hat{N}(t) \hat{N}(t + \tau) \rangle = \langle \hat{N}(0) \hat{N}(\tau) \rangle, \tag{7.60}$$

where we assumed that the noise process is stationary. As before, the spectral density of
noise is just the Fourier transform given by

$$S_{NN}(\omega) = \int_{-\infty}^{\infty} \Gamma_{NN}(\tau) \exp(i\omega\tau) \, d\tau. \tag{7.61}$$

It is remarkable that the quantum spectral density may not be a symmetric function of
frequency. The reason is that it is not possible to guarantee that the quantum autocorrelation
function $\Gamma_{NN}(\tau)$ is always real. As a result, its Fourier transform can have different values
for positive and negative values of the same frequency.

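To build some insight into the quantum spectral density, let us consider a quantum device
with energy eigenstates $|\alpha\rangle$ that form a complete set such that $\sum_\alpha |\alpha\rangle \langle\alpha| = 1$. The density
operator $\rho$ of this device is diagonal in this basis with energy eigenvalues $E_\alpha$. We can thus
expand the autocorrelation function as [394]

$$\Gamma_{NN}(\tau) = \sum_\alpha \sum_\gamma \rho_{\alpha\alpha} \langle\alpha| \hat{N}(0) |\gamma\rangle \langle\gamma| \hat{N}(\tau) |\alpha\rangle
= \sum_\alpha \sum_\gamma \rho_{\alpha\alpha} \exp\left[ \frac{i}{\hbar} (E_\gamma - E_\alpha) \tau \right] \left| \langle\alpha| \hat{N}(0) |\gamma\rangle \right|^2, \tag{7.62}$$

where we used the time evolution operator $\hat{U}(t, t_0) = \exp[-\frac{i}{\hbar} \hat{H} (t - t_0)]$ given in
Section 2.2.5 to replace $\hat{N}(\tau)$ with $\hat{N}(0)$. Using this form of $\Gamma_{NN}(\tau)$ in Eq. (7.61) and
integrating over $\tau$, the quantum spectral density can be written as

$$S_{NN}(\omega) = \sum_\alpha \sum_\gamma \rho_{\alpha\alpha} \, 2\pi \delta\left( \omega - \frac{E_\alpha - E_\gamma}{\hbar} \right) \left| \langle\alpha| \hat{N}(0) |\gamma\rangle \right|^2
= \sum_\alpha \sum_\gamma \rho_{\alpha\alpha} W_{\alpha\gamma}, \tag{7.63}$$

where $W_{\alpha\gamma}$ is the noise-induced transition rate from the state $|\alpha\rangle$ to $|\gamma\rangle$ (in the form of
Fermi's golden rule).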

We can now interpret the quantum spectral density as a physical quantity describing how 
a quantum device exchanges energy with its reservoir at a given frequency. This should be 
compared with the classical spectral density, which describes how much noise power the 
device has at that frequency. Even though we used analogous definitions for classical and 
quantum spectral densities, there are drastic differences in the way they provide device- 
specific information to us. Clearly, we should explore whether we can recover all classical 
features from the quantum spectral density, as one would expect from the correspondence 
principle. 

For this purpose it is useful to partition the spectral density into its even and odd parts
as $S_{NN}(\omega) = S^e_{NN}(\omega) + S^o_{NN}(\omega)$. This partitioning is possible for an arbitrary function
when the even and odd parts are defined as

$$S^e_{NN}(\omega) = \frac{1}{2} [S_{NN}(\omega) + S_{NN}(-\omega)], \qquad S^o_{NN}(\omega) = \frac{1}{2} [S_{NN}(\omega) - S_{NN}(-\omega)]. \tag{7.64}$$

Clearly, the even and odd parts are the symmetric and antisymmetric parts of the quantum
spectral density, respectively. In what follows, we apply the partitioned form of quantum
spectral density to two well-known quantum systems, a two-level system and a harmonic
oscillator. We find that the even part of $S_{NN}(\omega)$ contains the same information as the
classical spectral density, whereas its odd part is related to damping of the quantum device
[394, 395]. Indeed, the origin of the odd part can be traced back to the noncommutative
nature of the quantum-noise operator: $[\hat{N}(t), \hat{N}(t')] \neq 0$.


7.2.1 A Two-Level System Coupled to a Noise Source 


We consider an atom with two energy states such that its ground state $|g\rangle$ and excited state
$|e\rangle$ have an energy difference of $E_e - E_g = \hbar\omega_{eg}$, where $\omega_{eg}$ is the transition frequency.
Interaction of the atom with a noise source is included through the total Hamiltonian

$$\hat{H}_T = \hat{H}_{eg} + \hat{H}_I, \tag{7.65}$$


where $\hat{H}_{eg}$ is the Hamiltonian of the two-level atom and $\hat{H}_I$ accounts for its interaction with
the noise source. As in Section 4.1.4, we use the Pauli spin matrices to write $\hat{H}_{eg}$ in the
form [251]

$$\hat{H}_{eg} = \frac{1}{2} \hbar\omega_{eg} (|e\rangle \langle e| - |g\rangle \langle g|) = \frac{1}{2} \hbar\omega_{eg} \sigma_3, \tag{7.66}$$

where energy is taken to be zero in the middle of the two energy levels. Transitions between
the two energy states are included through the raising and lowering operators introduced
in Aside 4.3 and defined as

$$\sigma_+ = \frac{1}{2} (\sigma_1 + i\sigma_2), \qquad \sigma_- = \frac{1}{2} (\sigma_1 - i\sigma_2). \tag{7.67}$$

These operators change the state of the atom as

$$\sigma_+ |e\rangle = 0, \qquad \sigma_+ |g\rangle = |e\rangle, \qquad \sigma_- |g\rangle = 0, \qquad \sigma_- |e\rangle = |g\rangle. \tag{7.68}$$

Furthermore, they satisfy the relations [252]:

$$[\sigma_3, \sigma_+] = 2\sigma_+, \qquad [\sigma_+, \sigma_-] = \sigma_3, \qquad \sigma_+^2 = \sigma_-^2 = 0, \qquad \sigma_+^\dagger = \sigma_-. \tag{7.69}$$


The noise source is included through the operator $\hat{F}(t)$. We work in the interaction picture
and write the interaction Hamiltonian as [394, 395]

$$\tilde{H}_I(t) = \eta \tilde{F}(t) \tilde{\sigma}_1(t), \tag{7.70}$$

where a tilde denotes an operator in the interaction picture. This Hamiltonian is responsible
for energy exchange between the two-level system and the noise source. We assume that the
atom interacts weakly with the noise source, and first-order perturbation theory is adequate
to treat the interaction. In this situation, the wave function evolves with time as

$$|\Psi_I(t)\rangle = |\Psi_I(0)\rangle - \frac{i}{\hbar} \int_0^t \tilde{H}_I(\tau) |\Psi_I(0)\rangle \, d\tau, \tag{7.71}$$

where $|\Psi_I(0)\rangle$ is the wave function of the system at $t = 0$.
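Consider first the case of a two-level atom initially in its ground state. The probability
amplitude for finding the system in its excited state at time $t$ is given by

$$\alpha_e(t) = \langle e | \Psi_I(t) \rangle = -\frac{i}{\hbar} \int_0^t \langle e | \tilde{H}_I(\tau) | g \rangle \, d\tau, \tag{7.72}$$

where we used $\langle e | g \rangle = 0$ for the two orthogonal states. Using $\tilde{H}_I(t) = \eta \tilde{F}(t) \tilde{\sigma}_1(t)$ in the
preceding equation together with $\langle e | \tilde{\sigma}_1(\tau) | g \rangle = \exp(i\omega_{eg}\tau)$, we obtain

$$\alpha_e(t) = -\frac{i\eta}{\hbar} \int_0^t \tilde{F}(\tau) \, e^{i\omega_{eg}\tau} \, d\tau. \tag{7.73}$$

The probability of finding the two-level system in the excited state is thus given by

$$P_e(t) = \langle |\alpha_e|^2 \rangle = \frac{\eta^2}{\hbar^2} \int_0^t \int_0^t \Gamma_{FF}(\tau_1 - \tau_2) \exp[i\omega_{eg}(\tau_1 - \tau_2)] \, d\tau_1 \, d\tau_2. \tag{7.74}$$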


The preceding expression can be simplified for a stationary noise process. Using
$\tau_1 = \tau_2 + \tau$, it can be written as

$$P_e(t) = \frac{\eta^2 t}{\hbar^2} \int_{-\infty}^{\infty} \Gamma_{FF}(\tau) \, e^{i\omega_{eg}\tau} \, d\tau = \frac{\eta^2 t}{\hbar^2} S_{FF}(\omega_{eg}), \tag{7.75}$$

where we used the definition of the spectral density after extending the limits of integration to
infinity. This result is valid after a sufficiently long duration of time in the limit of weak
coupling between the two-level atom and the noise source. When the system is initially in
the excited state, a similar analysis can be carried out to show that the probability of finding
the system in the ground state is given by

$$P_g(t) = \langle |\alpha_g|^2 \rangle = \frac{\eta^2 t}{\hbar^2} S_{FF}(-\omega_{eg}). \tag{7.76}$$

The time derivative of these probabilities gives us the transition rates between the ground
and the excited states. Using $\uparrow$ for the upward transitions ($|g\rangle \to |e\rangle$) and $\downarrow$ for the
downward transitions ($|e\rangle \to |g\rangle$), these transition rates are given by

$$\Gamma_\uparrow = \frac{\eta^2}{\hbar^2} S_{FF}(\omega_{eg}), \qquad \Gamma_\downarrow = \frac{\eta^2}{\hbar^2} S_{FF}(-\omega_{eg}). \tag{7.77}$$


This result shows that positive frequencies correspond to absorption of energy, while
negative frequencies correspond to emission of energy. If the two-level system is in thermal
equilibrium with the noise source at some temperature $T$, the transition rates of the
two-level system must satisfy the detailed-balance condition: $\Gamma_\uparrow/\Gamma_\downarrow = \exp(-\hbar\omega_{eg}/k_B T)$. This
in turn implies that the quantum spectral density must satisfy the relation

$$S_{FF}(\omega_{eg}) = \exp(-\hbar\omega_{eg}/k_B T) \, S_{FF}(-\omega_{eg}). \tag{7.78}$$
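
The detailed-balance condition can be turned into a small numerical helper. The sketch below is illustrative only: the function names are hypothetical, and the steady-state excited-state population $\Gamma_\uparrow/(\Gamma_\uparrow + \Gamma_\downarrow)$ is an added step that follows from balancing the two rates rather than a formula quoted from the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s
K_B = 1.380649e-23      # Boltzmann constant in J/K


def rate_ratio(omega_eg, temperature):
    """Detailed-balance ratio Gamma_up / Gamma_down = exp(-hbar*omega_eg / (k_B*T))."""
    return math.exp(-HBAR * omega_eg / (K_B * temperature))


def temperature_from_rates(gamma_up, gamma_down, omega_eg):
    """Invert the detailed-balance condition to get the temperature of the noise source."""
    return HBAR * omega_eg / (K_B * math.log(gamma_down / gamma_up))


def excited_population(omega_eg, temperature):
    """Steady-state population of |e> obtained from P_e * Gamma_down = P_g * Gamma_up."""
    r = rate_ratio(omega_eg, temperature)
    return r / (1.0 + r)


if __name__ == "__main__":
    # Assumed illustration: a 5 GHz transition coupled to a noise source at 50 mK.
    omega = 2.0 * math.pi * 5e9
    print("Gamma_up / Gamma_down:", rate_ratio(omega, 0.05))
    print("excited-state population:", excited_population(omega, 0.05))
```

Measuring the two rates and inverting the ratio, as in `temperature_from_rates`, is one way such a relation could be used to assign a temperature to a noise source.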


7.2.2 A Harmonic Oscillator Coupled to a Noise Source 


Another useful example of quantum noise is provided by a harmonic oscillator coupled to
a noise source [396, 397]. As before, we write the Hamiltonian in the form

$$\hat{H}_{ho} = \hat{H}_o + \hat{H}_I, \tag{7.79}$$

where $\hat{H}_o = \hbar\omega_o (\hat{a}^\dagger \hat{a} + \frac{1}{2})$ is the free part (see Section 4.3.2) that has the eigenstates,
$\hat{H}_o |n\rangle = E_n |n\rangle$, with energies $E_n = \hbar\omega_o (n + \frac{1}{2})$; here $\omega_o$ is the oscillation frequency. The
interaction part of the Hamiltonian can be written as [395]

$$\tilde{H}_I(t) = \eta \hat{x} \tilde{F}(t) = \eta x_0 [\hat{a} + \hat{a}^\dagger] \tilde{F}(t), \tag{7.80}$$

where $\eta$ is the coupling constant, $x_0 = \sqrt{\hbar/(2 m_{ho} \omega_o)}$ is the uncertainty in the position of
the harmonic oscillator of mass $m_{ho}$ in its ground state, and $\hat{F}$ is the operator responsible
for the noise.

Coupling of the harmonic oscillator to the noise source causes transitions among its
energy levels such that the state $|n\rangle$ changes to $|n + 1\rangle$ or $|n - 1\rangle$. As all transitions are
between two neighboring energy levels separated by $\hbar\omega_o$, we can use the transition rates
found in Section 7.2.1 even for a harmonic oscillator. Thus, the rate for increasing the
number of quanta in the oscillator by one, taking the state $|n\rangle$ to $|n + 1\rangle$, is given by

$$\Gamma_{n \to n+1} = \frac{\eta^2}{\hbar^2} \left[ (n + 1) x_0^2 \right] S_{FF}(\omega_o) = (n + 1) \Gamma_\uparrow. \tag{7.81}$$

Similarly, the rate for taking the state $|n\rangle$ to $|n - 1\rangle$ is given by

$$\Gamma_{n \to n-1} = \frac{\eta^2}{\hbar^2} \left[ n x_0^2 \right] S_{FF}(-\omega_o) = n \Gamma_\downarrow. \tag{7.82}$$

Using the preceding two rates, we can construct the following master equation for the
probability of finding the harmonic oscillator in the state $|n\rangle$:

$$\frac{d}{dt} P_n(t) = \left[ n \Gamma_\uparrow P_{n-1}(t) + (n + 1) \Gamma_\downarrow P_{n+1}(t) \right] - \left[ n \Gamma_\downarrow P_n(t) + (n + 1) \Gamma_\uparrow P_n(t) \right]. \tag{7.83}$$

The first two terms describe the transitions into the state $|n\rangle$ from the states $|n - 1\rangle$ and
$|n + 1\rangle$ that increase $P_n(t)$. The last two terms describe transitions out of the state $|n\rangle$ to the
states $|n - 1\rangle$ and $|n + 1\rangle$ that decrease $P_n(t)$.

The average energy of the oscillator at time $t$ can be written as
$\langle E(t) \rangle = \sum_{n=0}^{\infty} \hbar\omega_o (n + \frac{1}{2}) P_n(t)$. By differentiating this expression one can show that
the average energy changes with time as [395]

$$\frac{d \langle E(t) \rangle}{dt} = P_s - \gamma \langle E(t) \rangle, \tag{7.84}$$

where $P_s$ and $\gamma$ are defined as

$$P_s = \frac{\eta^2}{2 m_{ho}} \left\{ \frac{1}{2} \left[ S_{FF}(\omega_o) + S_{FF}(-\omega_o) \right] \right\}, \tag{7.85}$$

$$\gamma = \frac{\eta^2 x_0^2}{\hbar^2} \left[ S_{FF}(-\omega_o) - S_{FF}(\omega_o) \right]. \tag{7.86}$$

Physically, $P_s$ denotes the power supplied to the harmonic oscillator by the noise source and
$\gamma$ is the rate at which energy of the harmonic oscillator is dissipated. Notice that $P_s$ depends
on the symmetrized part of the spectral density (the even part), whereas $\gamma$ depends on its
odd part. Clearly, the decay term has its origin in the asymmetric nature of the quantum
spectral density with respect to the positive and negative frequencies. Positive frequencies
represent absorption of energy by the oscillator, while negative frequencies denote loss of
energy through emission. The difference between these two is the net flow of energy from
the oscillator to the noise source.
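
The master equation (7.83) can be integrated numerically to see the oscillator relax toward a steady state. The following sketch rests on assumptions that are not in the text: the ladder is truncated at a finite number of levels (with no upward transition out of the top level, so that probability is conserved), time stepping uses the explicit Euler method, and the rates are dimensionless illustrative values. Setting the time derivative of the mean number of quanta to zero gives the steady-state value $\Gamma_\uparrow/(\Gamma_\downarrow - \Gamma_\uparrow)$, which the simulation should approach.

```python
def integrate_master_equation(gamma_up, gamma_down, n_levels=60,
                              dt=0.002, t_final=20.0):
    """Euler integration of Eq. (7.83) on a truncated ladder, starting from |0>."""
    p = [0.0] * n_levels
    p[0] = 1.0
    top = n_levels - 1
    for _ in range(int(round(t_final / dt))):
        dp = [0.0] * n_levels
        for n in range(n_levels):
            gain = 0.0
            if n > 0:
                gain += n * gamma_up * p[n - 1]          # |n-1> -> |n>
            if n < top:
                gain += (n + 1) * gamma_down * p[n + 1]  # |n+1> -> |n>
            loss = n * gamma_down * p[n]                 # |n> -> |n-1>
            if n < top:
                loss += (n + 1) * gamma_up * p[n]        # |n> -> |n+1>
            dp[n] = gain - loss
        p = [p_n + dt * d for p_n, d in zip(p, dp)]
    return p


def mean_quanta(p):
    """Average number of quanta, sum over n of n * P_n."""
    return sum(n * p_n for n, p_n in enumerate(p))


def steady_state_quanta(gamma_up, gamma_down):
    """Steady-state mean number of quanta, Gamma_up / (Gamma_down - Gamma_up)."""
    return gamma_up / (gamma_down - gamma_up)


if __name__ == "__main__":
    # Illustrative dimensionless rates with Gamma_down > Gamma_up (net damping).
    probs = integrate_master_equation(0.5, 1.0)
    print("simulated mean quanta:", mean_quanta(probs))
    print("steady-state prediction:", steady_state_quanta(0.5, 1.0))
```

The truncation is harmless here because the steady-state populations fall off geometrically with $n$ when $\Gamma_\uparrow < \Gamma_\downarrow$.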

Another feature that differentiates quantum spectral density from its classical counterpart
is its finite value at zero absolute temperature (0 K). This feature has its origin in the
zero-point energy in the ground state of the oscillator. It can be viewed as the energy
that remains in the harmonic oscillator even when all motion has ceased. As we have
seen, a harmonic oscillator contains $\frac{1}{2} \hbar\omega_o$ energy in this state. The origin of zero-point
energy is attributed to Heisenberg's uncertainty principle associated with two conjugate
variables (position and momentum in this case) (see Aside 2.11). If its position remains
uncertain in the ground state, a harmonic oscillator should be moving and must have a
finite energy.

The concept of zero-point energy can be extended to electromagnetic waves after recalling
from Section 2.3 that each electromagnetic mode is equivalent to a harmonic oscillator
and is thus subject to the same uncertainty principle. Thus, an electromagnetic mode
oscillating at frequency $\omega_m$ must have a minimum energy of $\frac{1}{2} \hbar\omega_m$ on average. Even
though this is a tiny amount of energy for that mode, the total zero-point energy can
be considerable when an enormous number of modes oscillating at different frequencies
occupy the vacuum. Casimir showed in 1948 that one consequence of the zero-point
energy is the presence of an attractive force between two uncharged, conducting parallel
plates [398]. Callen and Welton [175] found in 1951 the quantum version of the
fluctuation-dissipation theorem, whose classical version (see Section 3.3) has been known
since 1928 [175]. This theorem shows that when a device dissipates energy in an irreversible
manner, fluctuations in the coupled reservoir are unavoidable. As a consequence,
it is not possible to separate fluctuations from dissipations. This theorem is applicable
to a resistor, which exhibits current fluctuations as it dissipates energy as heat. We discuss
the quantum spectral density of a resistor in Aside 7.3 using a transmission-line
model.


Aside 7.3 Quantum Noise of a Resistor

The impedance of a semi-infinite, lossless transmission line is a purely real quantity that is
also frequency independent. Energy supplied by a source at one end of such a line is
transmitted through the line without being dissipated along the line. However, because any
launched energy does not return, it can be considered to be lost. For this reason, a lossless
transmission line can be used to model an ideal resistor. If the transmission line has an
inductance $L$ per unit length and capacitance $C$ per unit length, then the resistance is given
by $R = \sqrt{L/C}$. We calculate the quantum spectral density of a resistor using this
approach [263, 394]. The main reason behind our choice is that a transmission line
is also equivalent to a large collection of harmonic oscillators that can be readily
quantized.


As seen in Figure 7.1, a transmission line can be divided into small sections of length $\Delta z$
such that each section contains an LC circuit. As $\Delta z \to 0$, the lumped model turns into
a distributed model. Consider the $n$th section at a distance $z = z_n$ from the origin. Its LC
circuit has lumped capacitance $C\Delta z$ and lumped inductance $L\Delta z$. Suppose this section
has a loop current $I_n(t, z_n)$. Associated with this current is the loop charge,
$Q_n(t, z_n) = \int^t I_n(t', z_n) \, dt'$. We use this charge as the generalized coordinate and write the Lagrangian of
the system as a sum over an infinite number of LC sections. Noting that the Lagrangian for
an electrical system represents the difference of energies stored in inductors and capacitors,
we obtain


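$$\mathcal{L} = \sum_n \left[ \frac{L\Delta z}{2} \left( \frac{\partial Q_n(t, z_n)}{\partial t} \right)^2 - \frac{1}{2C\Delta z} \left( Q_{n+1}(t, z_{n+1}) - Q_n(t, z_n) \right)^2 \right]
= \sum_n \left[ \frac{L}{2} \left( \frac{\partial Q_n(t, z_n)}{\partial t} \right)^2 - \frac{1}{2C} \left( \frac{Q_{n+1}(t, z_{n+1}) - Q_n(t, z_n)}{\Delta z} \right)^2 \right] \Delta z. \tag{7.87}$$

Figure 7.1 Transmission line model of a resistor containing a large number of cascaded LC sections. The $n$th infinitesimal LC
section of length $\Delta z$ is shown with a loop current $I_n(t, z_n)$.

Taking the limit $\Delta z \to 0$ and writing the sum as an integral, we obtain

$$\mathcal{L} = \int_0^\infty \left[ \frac{L}{2} \left( \frac{\partial Q}{\partial t} \right)^2 - \frac{1}{2C} \left( \frac{\partial Q}{\partial z} \right)^2 \right] dz, \tag{7.88}$$

where $Q$ stands for $Q_n(t, z)$ in the limit $\Delta z \to 0$.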


We can now use the standard variational method. Applying the Euler-Lagrange equation 
to the preceding Lagrangian, we obtain the wave equation 
o 180 0, (7.89) 
af v OF 
where the velocity is defined as vp = ae To find the modes supported by the transmission 
line, we first assume that its length / is finite, and let l —> oo later to mimic a semi-infinite 
transmission line. For our finite-length transmission line, the boundary conditions at the 
two ends are Q(t,0) = 0, and Q(t,/) = O because no currents can flow beyond these 
points. The general solution of the wave equation satisfying these boundary conditions is 
found to be 


lee) 7 , 
OGD =Y M), m= ie sin(nzz/0), (7.90) 


n=1 
where u,,(z) is the nth mode of the transmission line assumed to be normalized such that 
i Um(Z)Un(Z) dz = Smn- 
When we substitute the preceding normal-mode expansion into the original Lagrangian 
and integrate over the length of the transmission line, we obtain the reduced Lagrangian 


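\[
\mathcal{L} = \sum_{n=1}^{\infty}\left[\frac{L}{2}\left(\frac{dQ_n}{dt}\right)^2 - \frac{\kappa_n^2}{2C}\,Q_n^2\right]
= \frac{L}{2}\sum_{n=1}^{\infty}\left[\left(\frac{dQ_n}{dt}\right)^2 - \omega_n^2 Q_n^2\right],
\tag{7.91}
\]

where $\kappa_n = n\pi/l$ and $\omega_n = \kappa_n v_0$. This form is identical to the Lagrangian of a set of
harmonic oscillators with oscillation frequencies $\omega_n$. Each of these harmonic oscillators
can be quantized using the appropriate creation ($a_n^\dagger$) and annihilation ($a_n$) operators (see
Section 2.3).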
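
The normal-mode expansion can be checked numerically. The sketch below is illustrative only: the helper names (`mode`, `overlap`, `mode_frequency`) and the per-unit-length values used in the usage note are assumptions, not quantities from the text. It evaluates the overlap integral of two sine modes on $[0, l]$ with a trapezoidal rule and computes the mode frequency from $\kappa_n = n\pi/l$ and $v_0 = 1/\sqrt{LC}$.

```python
import numpy as np

def mode(n, z, length):
    """n-th mode u_n(z) = sqrt(2/l) * sin(n*pi*z/l) of a line of length l."""
    return np.sqrt(2.0 / length) * np.sin(n * np.pi * z / length)

def overlap(m, n, length, samples=20001):
    """Trapezoidal estimate of the integral of u_m(z) * u_n(z) over [0, l]."""
    z = np.linspace(0.0, length, samples)
    f = mode(m, z, length) * mode(n, z, length)
    dz = z[1] - z[0]
    return dz * (f.sum() - 0.5 * (f[0] + f[-1]))

def mode_frequency(n, length, inductance_per_m, capacitance_per_m):
    """omega_n = kappa_n * v_0 with kappa_n = n*pi/l and v_0 = 1/sqrt(L*C)."""
    v0 = 1.0 / np.sqrt(inductance_per_m * capacitance_per_m)
    return n * np.pi / length * v0
```

For instance, `overlap(2, 2, 1.5)` should be close to 1 and `overlap(2, 3, 1.5)` close to 0, while `mode_frequency(1, 1.0, 250e-9, 100e-12)` gives the fundamental frequency of a hypothetical 1 m line with 250 nH/m and 100 pF/m.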


One can use the preceding analysis to calculate the noise voltage $V(t)$ at $z = 0$. Once the
correlation function $\Gamma_{VV}(t)$ is found, its Fourier transform provides the quantum spectral
density (two-sided) in the form

\[
S_{VV}(\omega) = \frac{2\hbar\omega R}{1 - \exp(-\hbar\omega/k_B T)}.
\tag{7.92}
\]

For frequencies such that $\hbar\omega \ll k_B T$, we can use the approximation $e^{-x} \approx 1 - x$ to find
$S_{VV}(\omega) = 2k_B T R$, the same result we obtained earlier for the two-sided spectral density
of thermal noise. This case corresponds to the classical limit in which electronic devices
operate. In the opposite limit, $\hbar\omega \gg k_B T$, known as the quantum limit, we can neglect the
exponential in the denominator to obtain $S_{VV}(\omega) = 2\hbar\omega R$.
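
The two limits can be verified with a few lines of code. The sketch below assumes the spectral density has the form $S_{VV}(\omega) = 2\hbar\omega R/[1 - \exp(-\hbar\omega/k_B T)]$, which reproduces both limits quoted above; the function name `s_vv` and the numerical values in the usage note are illustrative assumptions.

```python
import numpy as np

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
KB = 1.380649e-23       # Boltzmann constant (J/K)

def s_vv(omega, temperature, resistance):
    """Two-sided quantum spectral density of the noise voltage for omega > 0.

    Assumed form: 2*hbar*omega*R / (1 - exp(-hbar*omega / (kB*T))).
    """
    x = HBAR * omega / (KB * temperature)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny.
    return 2.0 * HBAR * omega * resistance / (-np.expm1(-x))
```

At 1 MHz and 300 K with a 50-ohm resistor, `s_vv(2 * np.pi * 1e6, 300.0, 50.0)` is essentially $2k_B T R$; at optical frequencies and 1 K the same function returns $2\hbar\omega R$.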


7.3 Quantum Langevin Equations 


As discussed in Section 2.2.5, time evolution of a quantum system can be studied using 
three different approaches known as the Schrödinger picture, the Heisenberg picture, and 
the interaction picture. In the Schrödinger picture, the state of a system evolves with time
but operators remain unchanged. In the Heisenberg picture, operators evolve with time but 
the quantum state remains unchanged. The interaction picture is an intermediate situation 
where both states and operators are allowed to vary with time. The Heisenberg picture is 
most closely related to classical mechanics, where the dynamical variables (which become 
operators in quantum mechanics) evolve in time. For dissipative systems, the evolution 
of an operator is governed by a quantum Langevin equation, which contains random force 
terms representing the interaction of a quantum system with a surrounding reservoir. These 
force terms are necessary to preserve the commutation relations involved in any quantum 
description. 

As an example, consider the position operator $q$ and the momentum operator $p$. These
operators become time dependent in the Heisenberg picture: $q(t)$ and $p(t)$, respectively.
Their commutator initially at time $t = 0$ satisfies the relation $[q(0), p(0)] = i\hbar$. As this
relation does not change with time, it is necessary that the condition $[q(t), p(t)] = i\hbar$ is
satisfied at all times. In the Heisenberg picture, any operator $O(t)$ evolves with time as (see
Section 2.2.5)

\[
i\hbar\,\frac{d}{dt}O(t) = i\hbar\,\frac{\partial}{\partial t}O(t) + [O(t), H],
\tag{7.93}
\]


where the Hamiltonian H may also vary with time. This equation often bears a close resem- 
blance to the corresponding classical equation, which can be of benefit in a theoretical 
treatment. However, the Heisenberg equation of motion is generally nonlinear and may be 
harder to solve in practice. 


7.3.1 Quantum Theory of a Laser 


As an example of quantum Langevin equations, we consider a simple model of a laser 
(see Fig. 7.2) in which two-level atoms interact with the radiation field of a single excited 
mode of a resonant cavity [399, 400, 401, 402]. We use the Jaynes-Cummings Hamiltonian 
discussed in Section 4.1.4 for the two-level atoms. Each atom has a ground state $|g\rangle$ with
energy $E_g$ and an excited state $|e\rangle$ with energy $E_e$ such that $E_e - E_g = \hbar\omega_{eg}$. To simplify
the analysis, we assume that the frequency $\omega_k$ of a specific electromagnetic mode of the
laser cavity coincides with $\omega_{eg}$. This ensures that only one mode couples strongly to the
two-level atoms.

[Figure 7.2: Schematic of a laser cavity where entering atoms (represented by dots) are modeled as a two-level system. These atoms are coupled to an electromagnetic mode of the laser cavity.]

The laser model is based on the injection of atoms in their excited states into the laser’s 
cavity at random times. These atoms interact with the radiation mode and amplify it by 
emitting photons through stimulated emission. We assume that the probability distribution 
of the arrival time of atoms does not depend on time (a constant pumping rate). We also 
assume that cavity losses are sufficiently small that we can adopt the mean-field approxi- 
mation for the radiation field and discard its spatial variations. This field is assumed to be 
in the form of a plane wave with frequency $\omega_k$ and wave number $k = \omega_k/c$. As shown
in Ref. [400], such a model has a single statistical parameter $p$ in the range $0 \le p \le 1$, the
case of Poissonian statistics corresponding to $p = 0$.

The Hamiltonian of the system can be written in the rotating-wave approximation as 
(see Section 4.1.4) 


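\[
H = \hbar\omega_k a_k^\dagger a_k + \sum_j \left(E_e\sigma_e^j + E_g\sigma_g^j\right) + \hbar G\sum_j u(t - t_j)\left(a_k^\dagger\sigma^j + \sigma^{j\dagger}a_k\right),
\tag{7.94}
\]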


where the index $j$ accounts for a specific excited atom entering the cavity at time $t_j$. The step
function $u(t - t_j)$ is used to represent the initiation of coupling between the field and atom
at time $t_j$. The operators $a_k^\dagger$ and $a_k$ are the creation and annihilation operators of the single
lasing mode. The operators $\sigma_e^j = (|e\rangle\langle e|)^j$ and $\sigma_g^j = (|g\rangle\langle g|)^j$ are the projection operators
for the $j$th atom. The operator $\sigma^j = (|g\rangle\langle e|)^j$ is the spin-flip operator with the properties
$\sigma^j|e\rangle^j = |g\rangle^j$ and $\sigma^{j\dagger}|g\rangle^j = |e\rangle^j$. The coupling constant $G$ represents how strongly the
atoms interact with the cavity mode and can be written as

\[
G = (2\hbar\epsilon_0\omega_k V)^{-1/2}\,\omega_{eg}\,\mu_{eg},
\tag{7.95}
\]

where $\mu_{eg}$ is the magnitude of the atomic dipole moment and $V$ is the cavity volume.
Using the preceding Hamiltonian in the Heisenberg equation of motion given in Eq. 
(7.93), we obtain the following four equations for the atomic and field operators: 


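\[
\frac{da_k(t)}{dt} = -i\omega_k a_k(t) - iG\sum_j u(t - t_j)\,\sigma^j(t),
\tag{7.96}
\]
\[
\frac{d\sigma^j(t)}{dt} = -i\omega_k\sigma^j(t) + iG\,u(t - t_j)\left[\sigma_e^j(t) - \sigma_g^j(t)\right]a_k(t),
\tag{7.97}
\]
\[
\frac{d\sigma_e^j(t)}{dt} = iG\,u(t - t_j)\left[a_k^\dagger(t)\sigma^j(t) - \sigma^{j\dagger}(t)a_k(t)\right],
\tag{7.98}
\]
\[
\frac{d\sigma_g^j(t)}{dt} = -iG\,u(t - t_j)\left[a_k^\dagger(t)\sigma^j(t) - \sigma^{j\dagger}(t)a_k(t)\right].
\tag{7.99}
\]

These equations do not include the field-decay term resulting from loss of photons at the
output mirror of the cavity and the atomic decay terms related to spontaneous emission and
the interaction of atoms with an external reservoir.

The reasoning behind introducing the decay terms in the preceding set of equations
is analogous to the Weisskopf–Wigner theory discussed in Section 4.1.3. As we have
seen before, a Langevin force must also be included with each decay term. The resulting
equations are known as the quantum Langevin equations:

\[
\frac{da_k(t)}{dt} = -\left(i\omega_k + \frac{\gamma}{2}\right)a_k(t) - iG\sum_j u(t - t_j)\,\sigma^j(t) + F_\gamma(t),
\tag{7.100}
\]
\[
\frac{d\sigma^j(t)}{dt} = -(i\omega_k + \gamma_{eg})\,\sigma^j(t) + iG\,u(t - t_j)\left[\sigma_e^j(t) - \sigma_g^j(t)\right]a_k(t) + f^j(t),
\tag{7.101}
\]
\[
\frac{d\sigma_e^j(t)}{dt} = -(\gamma_e + \gamma_e')\,\sigma_e^j(t) + iG\,u(t - t_j)\left[a_k^\dagger(t)\sigma^j(t) - \sigma^{j\dagger}(t)a_k(t)\right] + f_e^j(t),
\tag{7.102}
\]
\[
\frac{d\sigma_g^j(t)}{dt} = -\gamma_g\sigma_g^j(t) + \gamma_e'\sigma_e^j(t) - iG\,u(t - t_j)\left[a_k^\dagger(t)\sigma^j(t) - \sigma^{j\dagger}(t)a_k(t)\right] + f_g^j(t).
\tag{7.103}
\]

Here $\gamma$ is the cavity's damping rate, $\gamma_e$ and $\gamma_g$ are the decay rates for the excited and ground
states, respectively, $\gamma_e'$ is the spontaneous decay rate, and $\gamma_{eg}$ is the decay rate of the atomic
polarization.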

As this laser model considers only a single cavity mode, it imposes a restriction on $\gamma$:
the bandwidth of the cavity mode (related to $\gamma$) must be much smaller than the cavity's
mode spacing given by $\delta\omega = (2\pi c)/L$ for a cavity of length $L$. For purely radiative decay,
the four atomic decay rates are related to each other as $2\gamma_{eg} = \gamma_e + \gamma_g + \gamma_e'$. This relationship
can be established by considering the Liouville equation of the density operator (see
Section 3.1.3). However, this relation holds only for purely radiative decay and does not
account for the plethora of collisions that affect the relative phase between the excited and
ground states, without significantly influencing their populations. When such collisions are
included, we obtain a more realistic relation: $2\gamma_{eg} > \gamma_e + \gamma_e' + \gamma_g$.

Equations (7.100) through (7.103) contain four Langevin forces denoted as $F_\gamma(t)$, $f^j(t)$,
$f_e^j(t)$, and $f_g^j(t)$. They vanish on average and thus do not contribute to the averaged equations
of motion. However, their correlation functions are finite. Assuming all four random forces 
are Gaussian processes (see Section 7.1.5), a correlation function is all we need to fully 
specify statistical properties of each Langevin-noise operator. However, we need to be 
careful because noise operators are noncommutative, and the order of the operators in an 
expression matters. In practice, we need to adopt some meaningful ordering to make the 
resulting expressions unique. The conventional approach is to use the normal ordering 
through Wick’s theorem (see Aside 2.14). 

The correlation functions of the Langevin force F(t) have the form 


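\[
\langle F_\gamma^\dagger(t)F_\gamma(t')\rangle = \gamma\,\langle n\rangle_T\,\delta(t - t'),
\tag{7.104}
\]
\[
\langle F_\gamma(t)F_\gamma^\dagger(t')\rangle = \gamma\left(\langle n\rangle_T + 1\right)\delta(t - t'),
\tag{7.105}
\]
\[
\langle F_\gamma(t)F_\gamma(t')\rangle = 0, \qquad \langle F_\gamma^\dagger(t)F_\gamma^\dagger(t')\rangle = 0,
\tag{7.106}
\]

where $\langle n\rangle_T$ is the average number of thermal photons in the laser cavity at frequency $\omega_k$.
The presence of delta functions in the preceding expressions indicates that the underlying
random process is Markovian with no past memory.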

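To simplify the form of the correlation functions for the atomic Langevin forces, we assume
that the reservoir is at zero absolute temperature ($T = 0$ K) so that the average number of
thermal photons in the cavity is zero ($\langle n\rangle_T = 0$). With this assumption, the nonvanishing
correlation functions of the atomic Langevin forces are given by

\[
\langle f_e^j(t)f_e^j(t')\rangle = (\gamma_e + \gamma_e')\,\langle\sigma_e^j(t)\rangle\,\delta(t - t'),
\tag{7.107}
\]
\[
\langle f_g^j(t)f_g^j(t')\rangle = \gamma_g\,\langle\sigma_g^j(t)\rangle\,\delta(t - t') + \gamma_e'\,\langle\sigma_e^j(t)\rangle\,\delta(t - t'),
\tag{7.108}
\]
\[
\langle f_e^j(t)f_g^j(t')\rangle = -\gamma_e'\,\langle\sigma_e^j(t)\rangle\,\delta(t - t'),
\tag{7.109}
\]
\[
\langle f^j(t)f_g^j(t')\rangle = -\gamma_e'\,\langle\sigma^j(t)\rangle\,\delta(t - t'),
\tag{7.110}
\]
\[
\langle f_g^j(t)f^j(t')\rangle = \gamma_g\,\langle\sigma^j(t)\rangle\,\delta(t - t'),
\tag{7.111}
\]
\[
\langle f^j(t)f_e^j(t')\rangle = \gamma_e\,\langle\sigma^j(t)\rangle\,\delta(t - t') + \gamma_e'\,\langle\sigma^j(t)\rangle\,\delta(t - t'),
\tag{7.112}
\]
\[
\langle f^j(t)f^{j\dagger}(t')\rangle = (2\gamma_{eg} - \gamma_g)\,\langle\sigma_g^j(t)\rangle\,\delta(t - t') + \gamma_e'\,\langle\sigma_e^j(t)\rangle\,\delta(t - t'),
\tag{7.113}
\]
\[
\langle f^{j\dagger}(t)f^j(t')\rangle = (2\gamma_{eg} - \gamma_e - \gamma_e')\,\langle\sigma_e^j(t)\rangle\,\delta(t - t').
\tag{7.114}
\]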


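As discussed in Section 7.1.6, the correlation functions of the Langevin forces are generally
written in the form $\langle F_\mu(t)F_\nu(t')\rangle = 2D_{\mu\nu}\,\delta(t - t')$, where $D_{\mu\nu}$ is a diffusion coefficient.
These coefficients can be calculated using the generalized Einstein relation derived in
Section 7.1.6. As an example, let us calculate $D_{ee}(t)$ using Eq. (7.57):

\[
2D_{ee}(t) = -\left\langle\sigma_e^j(t)\,\frac{d\sigma_e^j}{dt}\right\rangle - \left\langle\frac{d\sigma_e^j}{dt}\,\sigma_e^j(t)\right\rangle + \frac{d}{dt}\left\langle\sigma_e^j(t)\,\sigma_e^j(t)\right\rangle.
\tag{7.115}
\]

Recalling that $[\sigma_e^j(t)]^2 = \sigma_e^j(t)$ and using the relation $\langle d\sigma_e^j/dt\rangle = -(\gamma_e + \gamma_e')\langle\sigma_e^j(t)\rangle$ [obtained
from Eq. (7.102)], we obtain

\[
2D_{ee}(t) = (\gamma_e + \gamma_e')\,\langle\sigma_e^j(t)\rangle,
\tag{7.116}
\]

which agrees with the result in Eq. (7.107).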


7.3.2 Macroscopic Atomic Variables 


The Langevin equations (7.100) and (7.101) contain rapidly oscillating terms at the cavity
mode frequency $\omega_k$. It is useful to introduce slowly varying operators $A_k(t)$ and $\tilde\sigma^j(t)$ as

\[
a_k(t) = A_k(t)\exp(-i\omega_k t), \qquad \sigma^j(t) = \tilde\sigma^j(t)\exp(-i\omega_k t).
\tag{7.117}
\]

The Langevin equations for the two slowly varying operators remain the same as those
for $a_k(t)$ and $\sigma^j(t)$, with the only difference that the terms containing $\omega_k$ disappear. Let us
point out that the correlation functions do not change in the rotating frame because fast
variations cancel out during the averaging procedure.

At this point, we introduce three macroscopic atomic operators as [400],

\[
M(t) = -i\sum_j u(t - t_j)\,\tilde\sigma^j(t),
\tag{7.118}
\]
\[
N_e(t) = \sum_j u(t - t_j)\,\sigma_e^j(t),
\tag{7.119}
\]
\[
N_g(t) = \sum_j u(t - t_j)\,\sigma_g^j(t).
\tag{7.120}
\]

The operator $M(t)$ corresponds to the atomic polarization, while the operators $N_e(t)$ and
$N_g(t)$ represent the population of atoms in the excited and ground states, respectively.
It is important to note that when calculating the average values or correlation functions
associated with these macroscopic operators, one has to perform not only the quantum
mechanical average but also the classical average over the random arrival time $t_j$ of the
atoms injected into the cavity.
We can now write Eq. (7.100) for the field as

\[
\frac{d}{dt}A_k(t) = -\frac{\gamma}{2}A_k(t) + G\,M(t) + F_\gamma(t).
\tag{7.121}
\]
To find an equation for the operator $N_e(t)$, we differentiate Eq. (7.119) to obtain

\[
\frac{d}{dt}N_e(t) = \sum_j \delta(t - t_j)\,\sigma_e^j(t_j) + \sum_j u(t - t_j)\,\frac{d}{dt}\sigma_e^j(t),
\tag{7.122}
\]

where we use the relation $du/dt = \delta(t)$ and replace $\sigma_e^j(t)$ with its value at the time $t_j$ because
of the presence of the delta function. Substituting the time derivative from Eq. (7.102), we
obtain

\[
\frac{d}{dt}N_e(t) = \sum_j \delta(t - t_j)\,\sigma_e^j(t_j) + \sum_j u(t - t_j)\,f_e^j(t)
- (\gamma_e + \gamma_e')N_e(t) - G A_k^\dagger(t)M(t) - G M^\dagger(t)A_k(t).
\tag{7.123}
\]


To proceed further, we relate the sum in the first term to the mean pumping rate $R$. As
all atoms injected into the cavity are initially in the excited state, we have the relation
$\langle\sigma_e^j(t_j)\rangle = 1$. Thus,

\[
\left\langle\sum_j \delta(t - t_j)\,\sigma_e^j(t_j)\right\rangle = \left\langle\sum_j \delta(t - t_j)\right\rangle = \sum_j \frac{1}{T}\int_0^T \delta(t - t_j)\,dt = R,
\tag{7.124}
\]

where $R = N_T/T$ is the mean pumping rate when $N_T$ atoms enter the cavity over a
sufficiently long duration $T$. In terms of $R$, we write the first term in Eq. (7.123) as

\[
\sum_j \delta(t - t_j)\,\sigma_e^j(t_j) = R + \left[\sum_j \delta(t - t_j)\,\sigma_e^j(t_j) - R\right],
\tag{7.125}
\]

where the term inside the square brackets acts as a pumping-noise term that vanishes on
average. Substituting this expression in Eq. (7.123), we obtain

\[
\frac{d}{dt}N_e(t) = R - (\gamma_e + \gamma_e')N_e(t) - G A_k^\dagger(t)M(t) - G M^\dagger(t)A_k(t) + F_e(t),
\tag{7.126}
\]

where the new Langevin stochastic force is defined as

\[
F_e(t) = \sum_j u(t - t_j)\,f_e^j(t) + \sum_j \delta(t - t_j)\,\sigma_e^j(t_j) - R.
\tag{7.127}
\]


The first term in $F_e(t)$ accounts for population fluctuations, while the second term
accounts for pump fluctuations. It is easy to show that $\langle F_e(t)\rangle = 0$. We need to calculate
the correlation function of this force. Using the relation in Eq. (7.124) and noting that population
fluctuations and pump fluctuations are uncorrelated, one can write this correlation
function as

\[
\langle F_e(t)F_e(t')\rangle = C_1(t, t') + C_2(t, t') - R^2,
\tag{7.128}
\]
where

\[
C_1(t, t') = \left\langle\left[\sum_{j_1} u(t - t_{j_1})\,f_e^{j_1}(t)\right]\left[\sum_{j_2} u(t' - t_{j_2})\,f_e^{j_2}(t')\right]\right\rangle,
\tag{7.129}
\]
\[
C_2(t, t') = \left\langle\left[\sum_{j_1} \delta(t - t_{j_1})\,\sigma_e^{j_1}(t_{j_1})\right]\left[\sum_{j_2} \delta(t' - t_{j_2})\,\sigma_e^{j_2}(t_{j_2})\right]\right\rangle.
\tag{7.130}
\]

To evaluate $C_1(t, t')$, we note that only the terms with $j_1 = j_2$ contribute to the double sum.
As a result,

\[
C_1(t, t') = \left\langle\sum_j u(t - t_j)\,f_e^j(t)\,f_e^j(t')\right\rangle.
\tag{7.131}
\]

Substituting for $\langle f_e^j(t)f_e^j(t')\rangle$ and simplifying, we obtain

\[
C_1(t, t') = (\gamma_e + \gamma_e')\,\langle N_e(t)\rangle\,\delta(t - t'),
\tag{7.132}
\]

where we use Eqs. (7.107) and (7.119). To evaluate $C_2(t, t')$, we need to consider the
injection statistics of atoms following Ref. [400]. The result is found to be

\[
C_2(t, t') = R^2 + (1 - p)R\,\delta(t - t'),
\tag{7.133}
\]

where the parameter $p$ lies in the range $0 \le p \le 1$ depending on the injection statistics.
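
The role of the parameter $p$ can be illustrated with a small simulation. Integrating the pump-noise part of $C_2(t, t') - R^2 = (1 - p)R\,\delta(t - t')$ over a counting window of length $T_w$ gives a count variance of $(1 - p)RT_w$, so the ratio of variance to mean (the Fano factor) equals $1 - p$. The sketch below compares two limiting cases: Poissonian injection ($p = 0$) and perfectly regular injection ($p = 1$). The function name, the sampling scheme, and all numerical values are assumptions chosen for illustration; they are not taken from Ref. [400].

```python
import numpy as np

def fano_factor(arrival_times, window, t_max):
    """Variance-to-mean ratio of arrival counts in consecutive windows."""
    edges = np.arange(0.0, t_max + window, window)
    counts, _ = np.histogram(arrival_times, bins=edges)
    return counts.var() / counts.mean()

rng = np.random.default_rng(0)
R, t_max, window = 50.0, 2000.0, 1.0   # hypothetical pump rate and durations

# Poissonian injection (p = 0): independent exponential waiting times.
poisson_times = np.cumsum(rng.exponential(1.0 / R, size=int(1.2 * R * t_max)))
poisson_times = poisson_times[poisson_times < t_max]

# Regular injection (p = 1): equally spaced arrivals with a random offset.
regular_times = (np.arange(int(R * t_max)) + rng.random()) / R

print("Fano factor, Poissonian pump:", fano_factor(poisson_times, window, t_max))
print("Fano factor, regular pump:   ", fano_factor(regular_times, window, t_max))
```

The Poissonian stream should give a Fano factor near 1 and the regular stream a value near 0, mirroring the factor $(1 - p)$ multiplying $R$ in the pump-noise term.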
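We use the same method for $N_g(t)$. Equations for $N_g(t)$ and $M(t)$ are found to be

\[
\frac{d}{dt}N_g(t) = -\gamma_g N_g(t) + \gamma_e' N_e(t) + G A_k^\dagger(t)M(t) + G M^\dagger(t)A_k(t) + F_g(t),
\tag{7.134}
\]
\[
\frac{d}{dt}M(t) = -\gamma_{eg}M(t) + G\left[N_e(t) - N_g(t)\right]A_k(t) + F_M(t),
\tag{7.135}
\]
where the new Langevin forces are given by
\[
F_g(t) = \sum_j u(t - t_j)\,f_g^j(t) + \sum_j \delta(t - t_j)\,\sigma_g^j(t_j),
\tag{7.136}
\]
\[
F_M(t) = -i\sum_j u(t - t_j)\,f^j(t) - i\sum_j \delta(t - t_j)\,\tilde\sigma^j(t_j).
\tag{7.137}
\]

These forces vanish on average and their correlation functions can be calculated using the
procedure described for calculating $\langle F_e(t)F_e(t')\rangle$.

The following list summarizes the correlation functions for the three Langevin forces
associated with three macroscopic atomic parameters:

\[
\langle F_e(t)F_e(t')\rangle = (\gamma_e + \gamma_e')\,\langle N_e(t)\rangle\,\delta(t - t') + R(1 - p)\,\delta(t - t'),
\tag{7.138}
\]
\[
\langle F_g(t)F_g(t')\rangle = \gamma_g\,\langle N_g(t)\rangle\,\delta(t - t') + \gamma_e'\,\langle N_e(t)\rangle\,\delta(t - t'),
\tag{7.139}
\]
\[
\langle F_e(t)F_g(t')\rangle = -\gamma_e'\,\langle N_e(t)\rangle\,\delta(t - t'),
\tag{7.140}
\]
\[
\langle F_g(t)F_M(t')\rangle = \gamma_g\,\langle M(t)\rangle\,\delta(t - t'),
\tag{7.141}
\]
\[
\langle F_M(t)F_e(t')\rangle = (\gamma_e + \gamma_e')\,\langle M(t)\rangle\,\delta(t - t'),
\tag{7.142}
\]
\[
\langle F_M(t)F_g(t')\rangle = -\gamma_e'\,\langle M(t)\rangle\,\delta(t - t'),
\tag{7.143}
\]
\[
\langle F_M^\dagger(t)F_M(t')\rangle = (2\gamma_{eg} - \gamma_e - \gamma_e')\,\langle N_e(t)\rangle\,\delta(t - t') + R\,\delta(t - t'),
\tag{7.144}
\]
\[
\langle F_M(t)F_M^\dagger(t')\rangle = (2\gamma_{eg} - \gamma_g)\,\langle N_g(t)\rangle\,\delta(t - t') + \gamma_e'\,\langle N_e(t)\rangle\,\delta(t - t').
\tag{7.145}
\]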


Aside 7.4 describes an intuitive way to check the accuracy of these correlation functions. 

We have obtained in this section the quantum Langevin equations for the field operator 
$A_k$ and three macroscopic atomic operators $N_e(t)$, $N_g(t)$, and $M(t)$. These four equations
describe fully the laser dynamics and fluctuations associated with them. Their deriva- 
tion is instructive to the extent it shows underlying details that can be used to construct 
Langevin force terms for other quantum devices. The next task involves solving these 
operator equations using a computer and is the topic of the following section. 


Aside 7.4 A Fast Way to Evaluate Correlation Strengths of Langevin Forces

The Langevin force terms appearing in a quantum-noise analysis can often be calculated 
using simplified shot noise models [403, 404, 405, 392, 393, 406]. Even though such 
models are not always valid, they provide a valuable tool for checking the accuracy of 
correlation functions of the Langevin forces. We discuss such a shot-noise model for the 
Langevin forces associated with the quantum theory of lasers. 

As we saw in Section 7.1.4, shot noise results from the discrete nature of particles (photons, 
electrons, etc.) flowing in and out of their reservoirs. The associated spectral density of shot 
noise is constant (frequency independent) and is proportional to the average rate of particle 
flow. In the Langevin formalism, each flow of particles from a reservoir contributes shot 
noise to the noise associated with that reservoir. 

Consider for example two reservoirs labeled $r$ and $s$, and let $F_r(t)$ and $F_s(t)$ be the
Langevin forces associated with them. To calculate the strength of the correlation function
$\langle F_r(t)F_r(t')\rangle$, we simply add the flow rates of particles arriving at and leaving the reservoir
$r$. However, in the case of the correlation function $\langle F_r(t)F_s(t')\rangle$ representing the exchange
of particles between the reservoirs $r$ and $s$, we not only add the flow rates between the
reservoirs but also multiply the final result by $-1$. As most of the flow rates can be found
by simple inspection, it is possible to quickly estimate the correlation strengths of the
Langevin forces using this method.

For example, consider the correlation function $\langle F_g(t)F_g(t')\rangle$ given in Eq. (7.139). As there
are two particle flows associated with the ground state, namely the decay rate $\gamma_g\langle N_g(t)\rangle$ and
the spontaneous emission rate from the excited state, $\gamma_e'\langle N_e(t)\rangle$, the correlation strength
should be the sum of these values as given in Eq. (7.139). However, in the case of the
correlation function $\langle F_e(t)F_g(t')\rangle$ given in Eq. (7.140), the only rate that corresponds to an
exchange of particles between the two states is the spontaneous-emission rate $\gamma_e'\langle N_e(t)\rangle$.
Thus the correlation strength in this case is the negative of the above rate, as seen in
Eq. (7.140).
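
The bookkeeping rule of this aside can be written as a short function. The sketch below is a minimal illustration under stated assumptions: the flow list, the reservoir labels, and the numerical rates are hypothetical, and the pump flow is left out, so the excited-state result corresponds to Eq. (7.138) without its pump-noise term.

```python
def correlation_strength(flows, r, s):
    """Shot-noise estimate of the correlation strength between reservoirs r and s.

    flows is a list of (source, destination, rate) tuples.
    For r == s the rates of all flows entering or leaving r are added.
    For r != s the rates of flows exchanged between r and s are added
    and the sum is multiplied by -1.
    """
    if r == s:
        return sum(rate for src, dst, rate in flows if r in (src, dst))
    return -sum(rate for src, dst, rate in flows if {src, dst} == {r, s})

# Hypothetical steady-state values (arbitrary units).
gamma_e, gamma_g, gamma_sp = 1.0, 2.0, 0.5   # gamma_sp plays the role of the spontaneous rate
N_e, N_g = 10.0, 4.0

flows = [
    ("e", "g", gamma_sp * N_e),    # spontaneous emission e -> g
    ("e", "out", gamma_e * N_e),   # decay of the excited state to other levels
    ("g", "out", gamma_g * N_g),   # decay of the ground state
]

print(correlation_strength(flows, "g", "g"))  # gamma_g*N_g + gamma_sp*N_e
print(correlation_strength(flows, "e", "g"))  # -gamma_sp*N_e
```

The two printed values reproduce the structure of Eqs. (7.139) and (7.140) for the chosen numbers.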


7.3.3 c-Number Langevin Equations 


The Langevin operator equations often cannot be solved analytically, thus requiring the 
use of numerical methods. However, numerical methods cannot deal with operators. It is 
thus necessary to convert the operator equations into their equivalent c-number form. It is 
not obvious how to do that when two operators do not commute. The solution is to define 
the order of operators uniquely so that a direct mapping between an operator product and 
its equivalent c-number representation can be established. It is common to employ normal 
ordering in which the Hermitian conjugate of an operator precedes it. Following this rule, 
we adopt the following order for the atomic and field operators appearing in the quantum 
Langevin equations: $A_k^\dagger(t)$, $M^\dagger(t)$, $N_e(t)$, $N_g(t)$, $M(t)$, $A_k(t)$. However, fixing of the order of
operators may require redefinition of some correlation relations. The resulting c-number 
equations are only accurate up to second-order moments, which is not a real restriction 
because Langevin forces are specified fully through their first two moments. 

To obtain the c-number equations, we follow these steps. (1) Put the operator equations 
in normal order and rearrange the terms to match the operator order indicated above. (2) 
Replace the operators with the corresponding c numbers using the mapping 


\[
A_k(t) \to \mathcal{A}_k(t), \quad M(t) \to \mathcal{M}(t), \quad N_e(t) \to \mathcal{N}_e(t), \quad N_g(t) \to \mathcal{N}_g(t).
\tag{7.146}
\]


(3) Establish the correct correlation functions by matching the first- and second-order 
moments with those obtained earlier for the corresponding operators. 

The first two steps produce the following c-number Langevin equations for the four 
operators: 


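\[
\frac{d}{dt}\mathcal{A}_k(t) = -\frac{\gamma}{2}\mathcal{A}_k(t) + G\mathcal{M}(t) + \mathcal{F}_\gamma(t),
\tag{7.147}
\]
\[
\frac{d}{dt}\mathcal{M}(t) = -\gamma_{eg}\mathcal{M}(t) + G\left[\mathcal{N}_e(t) - \mathcal{N}_g(t)\right]\mathcal{A}_k(t) + \mathcal{F}_M(t),
\tag{7.148}
\]
\[
\frac{d}{dt}\mathcal{N}_e(t) = R - (\gamma_e + \gamma_e')\mathcal{N}_e(t) - G\mathcal{A}_k^*(t)\mathcal{M}(t) - G\mathcal{M}^*(t)\mathcal{A}_k(t) + \mathcal{F}_e(t),
\tag{7.149}
\]
\[
\frac{d}{dt}\mathcal{N}_g(t) = -\gamma_g\mathcal{N}_g(t) + \gamma_e'\mathcal{N}_e(t) + G\mathcal{A}_k^*(t)\mathcal{M}(t) + G\mathcal{M}^*(t)\mathcal{A}_k(t) + \mathcal{F}_g(t).
\tag{7.150}
\]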


Here the random processes $\mathcal{F}_k(t)$ with $k = \gamma, M, e$, or $g$ are the c-number Langevin forces
with the properties

\[
\langle\mathcal{F}_k(t)\rangle = 0, \qquad \langle\mathcal{F}_k(t)\mathcal{F}_l(t')\rangle = 2\mathcal{D}_{kl}\,\delta(t - t'),
\tag{7.151}
\]

where $\mathcal{D}_{kl}$ are the diffusion coefficients.

For the third step, we need to match the second-order moments for each operator and
its c-number and find the correct diffusion coefficient. We illustrate the steps involved
by matching the correlation functions for the operator $M(t)$ and its c-number counterpart $\mathcal{M}(t)$. In the case of the
operator $M$, the correlation function $\langle M^\dagger(t)M(t)\rangle$ is related to $dM/dt$ as

\[
\frac{d}{dt}\langle M^\dagger(t)M(t)\rangle = \left\langle\frac{dM^\dagger(t)}{dt}\,M(t)\right\rangle + \left\langle M^\dagger(t)\,\frac{dM(t)}{dt}\right\rangle.
\tag{7.152}
\]

Substituting the expression for $dM/dt$, we obtain


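\[
\begin{aligned}
\frac{d}{dt}\langle M^\dagger(t)M(t)\rangle ={}& -2\gamma_{eg}\langle M^\dagger(t)M(t)\rangle + \langle M^\dagger(t)F_M(t)\rangle + \langle F_M^\dagger(t)M(t)\rangle \\
& + G\langle M^\dagger(t)\left[N_e(t) - N_g(t)\right]A_k(t)\rangle + G\langle A_k^\dagger(t)\left[N_e(t) - N_g(t)\right]M(t)\rangle.
\end{aligned}
\tag{7.153}
\]

Using the generalized Einstein relation (see Section 7.1.6), $\langle M^\dagger(t)F_M(t)\rangle + \langle F_M^\dagger(t)M(t)\rangle =
2D_{M^\dagger M}$, we can write the preceding equation in the form

\[
\begin{aligned}
\frac{d}{dt}\langle M^\dagger(t)M(t)\rangle ={}& -2\gamma_{eg}\langle M^\dagger(t)M(t)\rangle + G\langle M^\dagger(t)\left[N_e(t) - N_g(t)\right]A_k(t)\rangle \\
& + G\langle A_k^\dagger(t)\left[N_e(t) - N_g(t)\right]M(t)\rangle + 2D_{M^\dagger M}.
\end{aligned}
\tag{7.154}
\]

The same approach is used for the correlation function $\langle M(t)M(t)\rangle$. Using the commutator
$[M(t), N_e(t) - N_g(t)] = 2M(t)$ to restore the chosen operator order, we obtain

\[
\begin{aligned}
\frac{d}{dt}\langle M(t)M(t)\rangle ={}& -2\gamma_{eg}\langle M(t)M(t)\rangle + 2G\langle\left[N_e(t) - N_g(t)\right]M(t)A_k(t)\rangle \\
& + 2G\langle M(t)A_k(t)\rangle + 2D_{MM}.
\end{aligned}
\tag{7.155}
\]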


The corresponding c-number equations for these two correlation functions are 


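\[
\begin{aligned}
\frac{d}{dt}\langle\mathcal{M}^*(t)\mathcal{M}(t)\rangle ={}& -2\gamma_{eg}\langle\mathcal{M}^*(t)\mathcal{M}(t)\rangle + G\langle\mathcal{M}^*(t)\left[\mathcal{N}_e(t) - \mathcal{N}_g(t)\right]\mathcal{A}_k(t)\rangle \\
& + G\langle\mathcal{A}_k^*(t)\left[\mathcal{N}_e(t) - \mathcal{N}_g(t)\right]\mathcal{M}(t)\rangle + 2\mathcal{D}_{M^*M},
\end{aligned}
\tag{7.156}
\]
\[
\frac{d}{dt}\langle\mathcal{M}(t)\mathcal{M}(t)\rangle = -2\gamma_{eg}\langle\mathcal{M}(t)\mathcal{M}(t)\rangle + 2G\langle\left[\mathcal{N}_e(t) - \mathcal{N}_g(t)\right]\mathcal{M}(t)\mathcal{A}_k(t)\rangle + 2\mathcal{D}_{MM}.
\tag{7.157}
\]
By matching the equations for each correlation function, we find the relations

\[
\mathcal{D}_{M^*M} = D_{M^\dagger M}, \qquad \mathcal{D}_{MM} = D_{MM} + G\langle M(t)A_k(t)\rangle.
\tag{7.158}
\]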


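Carrying out this procedure for all operators, diffusion coefficients for all c-number
Langevin equations are found to be

\[
2\mathcal{D}_{ee} = (\gamma_e + \gamma_e')\langle\mathcal{N}_e(t)\rangle + R(1 - p) - G\langle\mathcal{M}^*(t)\mathcal{A}_k(t) + \mathcal{A}_k^*(t)\mathcal{M}(t)\rangle,
\tag{7.159}
\]
\[
2\mathcal{D}_{gg} = \gamma_g\langle\mathcal{N}_g(t)\rangle + \gamma_e'\langle\mathcal{N}_e(t)\rangle - G\langle\mathcal{M}^*(t)\mathcal{A}_k(t) + \mathcal{A}_k^*(t)\mathcal{M}(t)\rangle,
\tag{7.160}
\]
\[
2\mathcal{D}_{eg} = -\gamma_e'\langle\mathcal{N}_e(t)\rangle + G\langle\mathcal{M}^*(t)\mathcal{A}_k(t) + \mathcal{A}_k^*(t)\mathcal{M}(t)\rangle,
\tag{7.161}
\]
\[
2\mathcal{D}_{M^*M} = (2\gamma_{eg} - \gamma_e - \gamma_e')\langle\mathcal{N}_e(t)\rangle + R,
\tag{7.162}
\]
\[
2\mathcal{D}_{MM} = 2G\langle\mathcal{M}(t)\mathcal{A}_k(t)\rangle, \qquad 2\mathcal{D}_{gM} = \gamma_g\langle\mathcal{M}(t)\rangle.
\tag{7.163}
\]

At this point, we can use a numerical technique for solving the c-number Langevin equations
on a computer and use the results for characterizing the noise dynamics of the laser
model.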
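
The text does not name a specific numerical technique. One common choice for equations of this kind is the forward Euler (Euler–Maruyama) scheme, sketched below for a deliberately reduced problem: a single damped c-number mode obeying $d\mathcal{A}/dt = -(\gamma/2)\mathcal{A} + \mathcal{F}(t)$ with $\langle\mathcal{F}^*(t)\mathcal{F}(t')\rangle = \gamma\bar n\,\delta(t - t')$, that is, the field equation with the atomic coupling switched off and a thermal reservoir of mean occupation $\bar n$. The function name, the parameter values, and the reduction itself are assumptions made for illustration; the full set of coupled laser equations requires correlated noise terms built from the complete diffusion matrix.

```python
import numpy as np

def simulate_mode(gamma, n_bar, dt, steps, trajectories, seed=0):
    """Euler-Maruyama integration of dA/dt = -(gamma/2) A + F(t).

    The complex white noise F satisfies <F*(t) F(t')> = gamma * n_bar * delta(t - t').
    Returns the final complex amplitude of every trajectory.
    """
    rng = np.random.default_rng(seed)
    a = np.zeros(trajectories, dtype=complex)
    two_d = gamma * n_bar                      # diffusion strength 2D
    for _ in range(steps):
        # Complex Gaussian increment with unit mean-square modulus.
        xi = (rng.standard_normal(trajectories)
              + 1j * rng.standard_normal(trajectories)) / np.sqrt(2.0)
        a = a - 0.5 * gamma * a * dt + np.sqrt(two_d * dt) * xi
    return a

amplitudes = simulate_mode(gamma=1.0, n_bar=5.0, dt=0.01, steps=1000, trajectories=4000)
print("steady-state <|A|^2> estimate:", np.mean(np.abs(amplitudes) ** 2))
```

After many damping times the ensemble average of $|\mathcal{A}|^2$ should settle near $\bar n$, which is the steady state of $d\langle\mathcal{A}^*\mathcal{A}\rangle/dt = -\gamma\langle\mathcal{A}^*\mathcal{A}\rangle + \gamma\bar n$. The step size must stay small compared with $1/\gamma$ for the scheme to be accurate.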


7.4 Noise Spectra 


The spectral density of a noise process can be viewed as its “fingerprint.” In principle, the 
spectral density contains the same information as temporal variations of the noise or its 
correlation function, but it is much more useful in practice. This is because different noise 
sources produce easily identifiable signatures in their noise spectral density even when 
their noise time traces appear indistinguishable on inspection. Thus, noise spectra play an 
important role in quantifying the performance of a nanoscale quantum device. 


7.4.1 Langevin Formalism 


In most quantum devices, the SNR is high enough that noise can be viewed as a perturbation
(see Aside 7.1 for a discussion of SNR). As a result, the drift term $D_\mu$ appearing in
the Langevin formalism can be considered a linear function of the system operators. This
allows us to write the drift term in the Langevin equation appearing in Section 7.1.6 as
$D_\mu = -\sum_\nu L_{\mu\nu}A_\nu$ and write this equation in the form

\[
\frac{d}{dt}A_\mu(t) = -\sum_\nu L_{\mu\nu}A_\nu(t) + F_\mu(t),
\tag{7.164}
\]

where $L_{\mu\nu}$ can be thought of as scalar elements of the matrix $\mathbf{L}$. It is useful to recast this
equation in a matrix form by introducing a column matrix $\mathbf{A}(t)$ such that:

\[
\mathbf{A}(t) = \left[A_1(t), A_2(t), \ldots, A_N(t)\right]^T,
\tag{7.165}
\]

where the superscript $T$ denotes the transpose operation. The resulting matrix equation is
\[
\frac{d}{dt}\mathbf{A}(t) = -\mathbf{L}\mathbf{A}(t) + \mathbf{F}(t).
\tag{7.166}
\]
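Taking the ensemble average and recalling that $\langle\mathbf{F}(t)\rangle = 0$, we find

\[
\frac{d}{dt}\langle\mathbf{A}(t)\rangle = -\mathbf{L}\langle\mathbf{A}(t)\rangle.
\tag{7.167}
\]
It follows that the average values of $A_\mu$ will decay to zero as time increases, for any $\mu$, if
all eigenvalues of $\mathbf{L}$ have positive real parts.

We proceed to calculate the spectral density, defined as the Fourier transform of the correlation
function $\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle$. For this purpose, we invoke the quantum regression theorem
(see Aside 7.2) and obtain

\[
\frac{d}{dt}\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle = -\mathbf{L}\,\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle,
\tag{7.168}
\]
\[
\frac{d}{dt}\langle\mathbf{A}(0)\mathbf{A}^T(t)\rangle = -\langle\mathbf{A}(0)\mathbf{A}^T(t)\rangle\,\mathbf{L}^T,
\tag{7.169}
\]

where we have assumed all noise processes to be stationary. Being first-order linear
differential equations, they can be easily solved to get
\[
\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle = \exp(-\mathbf{L}t)\,\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle,
\tag{7.170}
\]
\[
\langle\mathbf{A}(0)\mathbf{A}^T(t)\rangle = \langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle\,\exp(-\mathbf{L}^T t).
\tag{7.171}
\]

Invoking the Wiener–Khinchin theorem [385], the spectral-density matrix $\mathbf{S}_A(\omega)$ is
given by

\[
\mathbf{S}_A(\omega) = \int_{-\infty}^{\infty}\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle\,e^{-i\omega t}\,dt.
\tag{7.172}
\]

The time origin ($t = 0$) is chosen such that the system has reached its stationary state long
before that time. The location of the time origin does not matter in that situation.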

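In the stationary state of a system, physical parameters such as the mean and the variance
do not change over time. We exploit this feature and break up the Fourier integral into two
parts ranging between $(-\infty, 0]$ and $[0, +\infty)$. Changing $t$ to $-t$ in the first part, we can
write the spectral matrix in the form

\[
\mathbf{S}_A(\omega) = \int_0^{\infty}\langle\mathbf{A}(0)\mathbf{A}^T(t)\rangle\,e^{i\omega t}\,dt + \int_0^{\infty}\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle\,e^{-i\omega t}\,dt,
\tag{7.173}
\]

where we have used the relation $\langle\mathbf{A}(t)\mathbf{A}^T(0)\rangle = \langle\mathbf{A}(0)\mathbf{A}^T(-t)\rangle$. Substituting the results from
Eqs. (7.170) and (7.171), the integrals can be performed formally to obtain

\[
\mathbf{S}_A(\omega) = \langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle\,(\mathbf{L}^T - i\omega)^{-1} + (\mathbf{L} + i\omega)^{-1}\,\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle.
\tag{7.174}
\]

This result shows that once we find the matrix $\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle$, it is possible to calculate the
spectral density $\mathbf{S}_A(\omega)$ using simple matrix operations.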

Let us apply the preceding results to the generalized Einstein relation [see Section 7.1.6
and Eq. (7.57)]:

\[
2D_{\mu\nu} = \frac{d}{dt}\langle A_\mu(t)A_\nu(t)\rangle - \langle A_\mu(t)D_\nu(t)\rangle - \langle D_\mu(t)A_\nu(t)\rangle.
\tag{7.175}
\]

Recalling that $D_\mu = -\sum_\lambda L_{\mu\lambda}A_\lambda$ and that the time-derivative term vanishes in the steady state
reached at $t = 0$, we obtain

\[
2D_{\mu\nu} = \sum_\lambda\langle A_\mu(0)\,L_{\nu\lambda}A_\lambda(0)\rangle + \sum_\lambda\langle L_{\mu\lambda}A_\lambda(0)\,A_\nu(0)\rangle.
\tag{7.176}
\]

When quantum averages are carried out using the density operator, we can use the cyclic
property of the trace operation to write the preceding equation in a matrix form as

\[
2\mathbf{D} = \mathbf{L}\,\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle + \langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle\,\mathbf{L}^T,
\tag{7.177}
\]

where we pulled the matrix $\mathbf{L}$ outside the average, as its elements are c-numbers. By adding
and subtracting the quantity $i\omega\,\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle$, we obtain

\[
2\mathbf{D} = (\mathbf{L} + i\omega)\,\langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle + \langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle\,(\mathbf{L}^T - i\omega).
\tag{7.178}
\]

This equation suggests that the matrix $\mathbf{D}$ is related to the spectral density given in
Eq. (7.174). If we multiply the preceding equation on the left by $(\mathbf{L} + i\omega)^{-1}$ and on the
right by $(\mathbf{L}^T - i\omega)^{-1}$, we obtain the spectral density in the following simple form:

\[
\mathbf{S}_A(\omega) = (\mathbf{L} + i\omega)^{-1}\,2\mathbf{D}\,(\mathbf{L}^T - i\omega)^{-1}.
\tag{7.179}
\]

This is a remarkable result. It shows that the quantum spectral density of noise can be calculated
if we know the diffusion matrix and the drift-coefficient matrix for the Langevin
equations governing the dynamics of a quantum device. As a useful example, we apply this
method to calculate the intensity-noise spectrum and the phase-noise spectrum of a single-mode
laser. However, before calculating the noise spectra, we first discuss rate equations
used to model a laser.
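
The equivalence of the two forms of the spectral density can be checked numerically for any stable drift matrix. The sketch below uses hypothetical $2\times 2$ real matrices for $\mathbf{L}$ and $\mathbf{D}$ (they do not represent a particular device) and three illustrative helper functions. It solves the stationary relation $2\mathbf{D} = \mathbf{L}\mathbf{C} + \mathbf{C}\mathbf{L}^T$ for the covariance $\mathbf{C} = \langle\mathbf{A}(0)\mathbf{A}^T(0)\rangle$ by vectorization, then compares the covariance form of the spectrum with the diffusion-matrix form.

```python
import numpy as np

def spectrum_from_diffusion(L, D, omega):
    """S(omega) = (L + i*omega)^-1 (2D) (L^T - i*omega)^-1."""
    eye = np.eye(L.shape[0])
    left = np.linalg.inv(L + 1j * omega * eye)
    right = np.linalg.inv(L.T - 1j * omega * eye)
    return left @ (2.0 * D) @ right

def stationary_covariance(L, D):
    """Solve L C + C L^T = 2D for C using column-major vectorization."""
    n = L.shape[0]
    eye = np.eye(n)
    lhs = np.kron(eye, L) + np.kron(L, eye)   # acts on vec(C)
    vec_c = np.linalg.solve(lhs, (2.0 * D).reshape(-1, order="F"))
    return vec_c.reshape(n, n, order="F")

def spectrum_from_covariance(L, C, omega):
    """S(omega) = C (L^T - i*omega)^-1 + (L + i*omega)^-1 C."""
    eye = np.eye(L.shape[0])
    return (C @ np.linalg.inv(L.T - 1j * omega * eye)
            + np.linalg.inv(L + 1j * omega * eye) @ C)

# Hypothetical drift and diffusion matrices (arbitrary units).
L = np.array([[1.0, 0.3], [-0.2, 2.0]])
D = np.array([[0.5, 0.1], [0.1, 0.8]])

assert np.all(np.linalg.eigvals(L).real > 0)   # averages decay to zero
C = stationary_covariance(L, D)
S1 = spectrum_from_diffusion(L, D, omega=0.7)
S2 = spectrum_from_covariance(L, C, omega=0.7)
print(np.allclose(S1, S2))
```

For a single variable with drift coefficient $\gamma$ and diffusion coefficient $D$, the same function reduces to the Lorentzian $2D/(\gamma^2 + \omega^2)$.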


7.4.2 Rate Equations for Lasers 


Rate equations provide a simple way for describing the operation of most lasers. All lasers 
employ a cavity, where the pumping of a suitable gain medium is used to increase the number
of photons inside the cavity through stimulated emission. For many lasers, we need just
two rate equations for the variables $N_p$ and $N_c$ representing, respectively, the densities of
photons and gain carriers inside the cavity. In the case of an atomic gain medium, $N_c$ is
the inversion density representing the population difference $N_e - N_g$. In the case of semiconductor
lasers, $N_c$ is the density of electrons in the conduction band. The applicability of
rate equations is limited to problems where the phase of laser radiation plays a minor role.
They also ignore all spatial inhomogeneities, assuming that $N_p$ and $N_c$ represent quantities
spatially averaged over the cavity volume. In spite of these limitations, rate equations are
very useful for gaining insight into the operation of a laser. As they capture the underlying
physics by ensuring conservation of energy, their predictions are often in reasonable agreement
with experiments. The most complete set of single and multimode rate equations
applicable to lasers can be found in Ref. [407].
The rate equations for $N_c$ and $N_p$ can be written as [408]:

\[
\frac{dN_c}{dt} = N_{in} - R_{nr} - R_{sp} - v_g G(N_c, N_p)N_p,
\tag{7.180}
\]
\[
\frac{dN_p}{dt} = \left[\Gamma v_g G(N_c, N_p) - \frac{1}{\tau_p}\right]N_p + \Gamma\beta_{sp}R_{sp},
\tag{7.181}
\]

where $\Gamma = V_c/V_p$ is the ratio of the active volume $V_c$, to which the carriers are confined, to
the volume $V_p$ used by the photons. Various terms on the right side of each equation
account for the sources through which $N_c$ and $N_p$ increase or decrease. Thus, $N_{in}$ is the
rate at which carriers are injected into the laser (by pumping them) while $R_{nr}$ is the nonradiative
recombination rate at which carriers disappear without producing a photon. The next
term $R_{sp}$ represents the loss of carriers through spontaneous emission. The carriers can also
vanish through stimulated emission at a rate $R_{eg}N_p$ and reappear through absorption at a
rate $R_{ge}N_p$, where the order of the subscripts denotes the direction of atomic transition.
Their combined effects are included through the net gain $G$ defined as $v_g G = R_{eg} - R_{ge}$,
where $v_g$ is the group velocity of laser radiation inside the cavity. This term acts as the gain
in the photon rate equation, where the term containing the photon's lifetime $\tau_p$ accounts
for the loss of photons from the cavity. The last term represents the increase of photons
through spontaneous emission. However, only a fraction of the spontaneously emitted photons
couple to the cavity mode; $\beta_{sp}$ represents this fraction and is called the spontaneous
emission factor.
The output power $P$ of the laser is related to the photon density $N_p$ as

\[
P = (\eta_0/\tau_p)(N_p V_p \hbar\omega_k),
\tag{7.182}
\]

where $\eta_0$ is the efficiency with which photons of energy $\hbar\omega_k$ leak through a partially transparent
mirror of the cavity. Here $\omega_k$ is the frequency of the cavity mode that is in resonance
with the transition frequency of the gain medium.

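Fluctuations in a laser's power occur when $N_c$, $N_p$, and $N_{in}$ fluctuate around their steady-state
values. They can be included by writing these quantities as

\[
N_c = \bar N_c + \delta N_c, \qquad N_p = \bar N_p + \delta N_p, \qquad N_{in} = \bar N_{in} + \delta N_{in},
\tag{7.183}
\]

where a bar over a quantity denotes its steady-state value and a $\delta$ in front of it denotes a
small fluctuation. Because all fluctuations are assumed to be small, we can linearize the
rate equations by expanding all terms up to the first order in a Taylor series to obtain the
following set of equations:

\[
\frac{d\,\delta N_c}{dt} = \eta_{cc}\,\delta N_c + \eta_{cp}\,\delta N_p + \delta N_{in},
\tag{7.184}
\]
\[
\frac{d\,\delta N_p}{dt} = \eta_{pc}\,\delta N_c + \eta_{pp}\,\delta N_p,
\tag{7.185}
\]

where the coefficients $\eta_{ij}$ are defined as

\[
\eta_{cc} = -\frac{\partial}{\partial N_c}\left[R_{sp} + R_{nr} + v_g G(N_c, N_p)N_p\right],
\tag{7.186}
\]
\[
\eta_{cp} = -\frac{\partial}{\partial N_p}\left[R_{sp} + R_{nr} + v_g G(N_c, N_p)N_p\right],
\tag{7.187}
\]
\[
\eta_{pc} = \frac{\partial}{\partial N_c}\left\{\left[\Gamma v_g G(N_c, N_p) - \frac{1}{\tau_p}\right]N_p + \Gamma\beta_{sp}R_{sp}\right\},
\tag{7.188}
\]
\[
\eta_{pp} = \frac{\partial}{\partial N_p}\left\{\left[\Gamma v_g G(N_c, N_p) - \frac{1}{\tau_p}\right]N_p + \Gamma\beta_{sp}R_{sp}\right\}.
\tag{7.189}
\]

All of these derivatives are evaluated using the steady-state quantities: $\bar N_c$, $\bar N_p$, and $\bar N_{in}$.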
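
The linearization coefficients are simply the entries of the Jacobian of the rate equations. The sketch below evaluates them by central differences for a hypothetical model that is not specified in the text: a linear gain $G = a(N_c - N_{tr})$, $R_{sp} = N_c/\tau_{sp}$, and $R_{nr} = N_c/\tau_{nr}$, in normalized units. The dictionary keys, the function names, and all parameter values are illustrative assumptions, and the evaluation point is arbitrary rather than a computed steady state.

```python
import numpy as np

def rate_rhs(n_c, n_p, n_in, prm):
    """Right-hand sides of the carrier and photon rate equations (assumed model)."""
    gain = prm["a"] * (n_c - prm["n_tr"])      # hypothetical linear gain
    r_sp = n_c / prm["tau_sp"]                 # hypothetical spontaneous rate
    r_nr = n_c / prm["tau_nr"]                 # hypothetical nonradiative rate
    dn_c = n_in - r_nr - r_sp - prm["v_g"] * gain * n_p
    dn_p = ((prm["Gamma"] * prm["v_g"] * gain - 1.0 / prm["tau_p"]) * n_p
            + prm["Gamma"] * prm["beta_sp"] * r_sp)
    return np.array([dn_c, dn_p])

def eta_matrix(n_c, n_p, n_in, prm, step=1e-4):
    """Central-difference Jacobian [[eta_cc, eta_cp], [eta_pc, eta_pp]]."""
    col_c = (rate_rhs(n_c + step, n_p, n_in, prm)
             - rate_rhs(n_c - step, n_p, n_in, prm)) / (2.0 * step)
    col_p = (rate_rhs(n_c, n_p + step, n_in, prm)
             - rate_rhs(n_c, n_p - step, n_in, prm)) / (2.0 * step)
    return np.column_stack([col_c, col_p])

params = {"a": 1.0, "n_tr": 1.0, "tau_sp": 1.0, "tau_nr": 2.0,
          "tau_p": 0.1, "v_g": 1.0, "Gamma": 0.5, "beta_sp": 0.01}
eta = eta_matrix(3.0, 4.0, 1.0, params)
print(eta)
```

For this bilinear model the finite-difference result can be compared with the analytic derivatives, for example $\eta_{cc} = -(1/\tau_{sp} + 1/\tau_{nr} + v_g a N_p)$.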




7.4.3 Relative Intensity Noise of a Laser 


Fluctuations in the output power of a laser are measured through the relative intensity noise
(RIN), defined as the noise variance normalized to the square of the average power. If we write
the instantaneous output power $P(t)$ of the laser as $P(t) = \langle P(t)\rangle + \delta P(t)$,
the RIN is defined as

$$\mathrm{RIN} = \frac{\langle \delta P^2(t)\rangle}{\langle P(t)\rangle^2}. \tag{7.190}$$
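
As a small illustration of the definition in Eq. (7.190), the Python sketch below estimates the RIN from a sampled power trace. The trace is synthetic (a 1-mW mean level with 1% rms Gaussian noise, both assumed values), and the result is the total RIN integrated over all frequencies rather than a spectral density.

```python
import numpy as np


def relative_intensity_noise(power_samples):
    """Estimate RIN = <dP^2> / <P>^2 from samples of the output power."""
    p = np.asarray(power_samples, dtype=float)
    mean_power = p.mean()
    variance = np.mean((p - mean_power) ** 2)
    return variance / mean_power**2


# Synthetic trace: 1 mW average power with 1% rms fluctuations (assumed values)
rng = np.random.default_rng(seed=0)
trace = 1e-3 * (1.0 + 0.01 * rng.standard_normal(200_000))

rin = relative_intensity_noise(trace)
rin_db = 10.0 * np.log10(rin)
print(f"RIN = {rin:.3e} ({rin_db:.1f} dB)")
```

A 1% rms fluctuation corresponds to a RIN near $10^{-4}$, or about $-40$ dB.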
To calculate the RIN, we use the two linearized rate equations given in Eqs. (7.184)
and (7.185). However, we must add to them the Langevin noise terms $F_c(t)$ and $F_p(t)$,
which represent the noise sources. The resulting equations are


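$$\frac{d\,\delta N_c}{dt} = \eta_{cc}\,\delta N_c + \eta_{cp}\,\delta N_p + \delta N_{in} + F_c(t), \tag{7.191}$$

$$\frac{d\,\delta N_p}{dt} = \eta_{pc}\,\delta N_c + \eta_{pp}\,\delta N_p + F_p(t). \tag{7.192}$$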


The correlation functions of Langevin-noise terms can be easily found by following the 
simple strategy outlined in Aside 7.4. The resulting expressions are 


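$$\langle F_c(t)F_c(t')\rangle = \frac{1}{V_c}\left[N_{in} + R_{sp} + R_{nr} + R_{eg} + R_{ge}\right]\delta(t - t'), \tag{7.193}$$

$$\langle F_p(t)F_p(t')\rangle = \frac{1}{V_p^2}\left[V_p N_p/\tau_p + V_c\left(R_{eg} + R_{ge} + \beta_{sp}R_{sp}\right)\right]\delta(t - t'), \tag{7.194}$$

$$\langle F_p(t)F_c(t')\rangle = \langle F_c(t)F_p(t')\rangle = -\frac{1}{V_p}\left(R_{eg} + R_{ge} + \beta_{sp}R_{sp}\right)\delta(t - t'). \tag{7.195}$$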


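It is easier to solve the preceding two equations in the frequency domain. When we take
their Fourier transform, the time derivative is replaced with $i\omega$. The resulting equations can
be written in a matrix form as


$$\begin{pmatrix} -\eta_{cc} + i\omega & -\eta_{cp} \\ -\eta_{pc} & -\eta_{pp} + i\omega \end{pmatrix}
\begin{pmatrix} \mathcal{F}\{\delta N_c\}(\omega) \\ \mathcal{F}\{\delta N_p\}(\omega) \end{pmatrix}
= \begin{pmatrix} \mathcal{F}\{\delta N_{in}\}(\omega) + \mathcal{F}\{F_c\}(\omega) \\ \mathcal{F}\{F_p\}(\omega) \end{pmatrix}. \tag{7.196}$$

The solution is found by inverting the coefficient matrix and is given by

$$\begin{pmatrix} \mathcal{F}\{\delta N_c\}(\omega) \\ \mathcal{F}\{\delta N_p\}(\omega) \end{pmatrix}
= \frac{1}{D(\omega)}
\begin{pmatrix} -\eta_{pp} + i\omega & \eta_{cp} \\ \eta_{pc} & -\eta_{cc} + i\omega \end{pmatrix}
\begin{pmatrix} \mathcal{F}\{\delta N_{in}\}(\omega) + \mathcal{F}\{F_c\}(\omega) \\ \mathcal{F}\{F_p\}(\omega) \end{pmatrix}, \tag{7.197}$$


where $D(\omega) = (\eta_{cc} - i\omega)(\eta_{pp} - i\omega) - \eta_{pc}\eta_{cp}$.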
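
The inversion leading to Eq. (7.197) can be checked numerically. The Python sketch below solves the $2\times 2$ system of Eq. (7.196) with a linear solver and compares the result with the closed-form inverse built from $D(\omega)$. The coefficient values, the frequency, and the driving terms are arbitrary, dimensionless numbers assumed only for this check.

```python
import numpy as np


def solve_numerically(omega, eta, drive_c, drive_p):
    """Solve the matrix equation (7.196) at one frequency with a linear solver."""
    (eta_cc, eta_cp), (eta_pc, eta_pp) = eta
    matrix = np.array([[-eta_cc + 1j * omega, -eta_cp],
                       [-eta_pc, -eta_pp + 1j * omega]])
    rhs = np.array([drive_c, drive_p], dtype=complex)
    return np.linalg.solve(matrix, rhs)


def solve_closed_form(omega, eta, drive_c, drive_p):
    """Evaluate the closed-form solution (7.197) using the determinant D(omega)."""
    (eta_cc, eta_cp), (eta_pc, eta_pp) = eta
    det = (eta_cc - 1j * omega) * (eta_pp - 1j * omega) - eta_pc * eta_cp
    dn_c = ((-eta_pp + 1j * omega) * drive_c + eta_cp * drive_p) / det
    dn_p = (eta_pc * drive_c + (-eta_cc + 1j * omega) * drive_p) / det
    return np.array([dn_c, dn_p])


# Arbitrary illustrative inputs (dimensionless, assumed)
ETA = ((-5.0, -40.0), (1.0, -0.5))
OMEGA = 2.0
DRIVE_C = 1.0 + 0.0j   # stands in for F{dN_in} + F{F_c}
DRIVE_P = 0.3 - 0.2j   # stands in for F{F_p}

numeric = solve_numerically(OMEGA, ETA, DRIVE_C, DRIVE_P)
analytic = solve_closed_form(OMEGA, ETA, DRIVE_C, DRIVE_P)
print(np.allclose(numeric, analytic))
```

The closed form is preferable when spectra are needed on a dense frequency grid, since it vectorizes trivially over $\omega$.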

We can use these expressions to calculate the spectral densities of $N_c$ and $N_p$. However,
we are interested in the spectral density of power fluctuations $\delta P(t)$. Even though the
output power is proportional to $N_p$, its spectral density is not just related to
$|\mathcal{F}\{\delta N_p\}(\omega)|^2$, as pointed out by Yamamoto and Imoto in their 1986 paper
[406]. They found that the power spectral density of a laser operating above its threshold is
approximately equal to the classical shot-noise level found by Fourier transforming
$\langle F_p(t)F_p(t')\rangle$. However, the spectral density of $\delta N_p$ is Lorentzian in shape,
and its variance is equal to the average photon number $N_p$, as dictated by its Poisson statistics.
The difference stems from the presence of the output mirror, which acts as a random selector that
divides the internal photon stream into its reflected and transmitted parts. Even though the
average transmission of the mirror is constant, the selection of photons takes place at random
times. As a result, over a short time duration, irregularities in the transmission process lead
to partition noise, which affects the statistics of output power fluctuations.
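
The random-selector picture of the output mirror can be illustrated with a toy Monte Carlo model. This is only a sketch under simplifying assumptions, not the Langevin treatment used in the text: each photon is transmitted independently with a fixed probability, and the photon numbers, transmission, and number of trials are arbitrary choices. It shows that even a perfectly regular internal photon stream acquires noise after the mirror, and that a Poisson stream remains Poisson.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

TRANSMISSION = 0.3   # assumed probability that a photon leaves through the mirror
N_INSIDE = 1000      # assumed photons arriving at the mirror per time slot
TRIALS = 200_000     # number of simulated time slots

# Case 1: perfectly regular internal stream (no photon-number noise at all)
regular_in = np.full(TRIALS, N_INSIDE)
regular_out = rng.binomial(regular_in, TRANSMISSION)
var_regular = regular_out.var()   # partition noise: N * T * (1 - T)

# Case 2: Poisson-distributed internal stream
poisson_in = rng.poisson(N_INSIDE, TRIALS)
poisson_out = rng.binomial(poisson_in, TRANSMISSION)
mean_poisson = poisson_out.mean()
var_poisson = poisson_out.var()   # stays Poissonian: variance equals mean

print(f"regular input : output variance = {var_regular:.1f}")
print(f"Poisson input : output mean = {mean_poisson:.1f}, variance = {var_poisson:.1f}")
```

In the first case the output variance is close to $N T(1-T) = 210$ even though the input has no noise; this is the partition noise. In the second case the variance stays close to the mean, $NT = 300$.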

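We can account for this partition noise at the output mirror by adding a Langevin force
$F_m(t)$ to the output power fluctuations. The modified expression for the output power
variation $\delta P(t)$ becomes


$$\delta P(t) = (\eta_0/\tau_p)V_p\hbar\omega_{eg}\,\delta N_p(t) + F_m(t). \tag{7.198}$$

The spectral density of $\delta P(t)$ is given by

$$S_{PP}(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \langle \mathcal{F}\{\delta P\}(\omega)\,\mathcal{F}\{\delta P\}^*(\omega')\rangle\, d\omega'. \tag{7.199}$$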


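By taking the Fourier transform of Eq. (7.198), we obtain the power spectral density in the
form


$$S_{PP}(\omega) = \left[(\eta_0/\tau_p)V_p\hbar\omega_{eg}\right]^2 S_{\delta N_p\delta N_p}(\omega) + S_{F_mF_m}(\omega)
+ \left[(\eta_0/\tau_p)V_p\hbar\omega_{eg}\right]\left[S_{\delta N_pF_m}(\omega) + S_{F_m\delta N_p}(\omega)\right]. \tag{7.200}$$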


The last two terms result from interference between the two noise sources. 

The correlation function for $F_m(t)$ can be calculated by following the guidelines in
Aside 7.4. As $(\eta_0/\tau_p)N_pV_p$ is the rate at which photons escape the output mirror, it can be
written as


$$\langle F_m(t)F_m(t')\rangle = (\eta_0/\tau_p)N_pV_p(\hbar\omega_{eg})^2\,\delta(t - t'), \tag{7.201}$$


where the factor $(\hbar\omega_{eg})^2$ was added because $F_m(t)$ has power units, but the correlation
strengths are calculated using the photon number $N_p$. It follows that the spectral density of
$F_m$ is given by


$$S_{F_mF_m}(\omega) = (\eta_0/\tau_p)N_pV_p(\hbar\omega_{eg})^2. \tag{7.202}$$


The cross-correlation term can be calculated by considering the photon numbers inside
and outside the cavity and is found to be


$$\langle F_p(t)F_m(t')\rangle = -(\eta_0/\tau_p)N_p\hbar\omega_{eg}\,\delta(t - t') \;\Rightarrow\; S_{F_pF_m}(\omega) = -(\eta_0/\tau_p)N_p\hbar\omega_{eg}. \tag{7.203}$$


Note also that $S_{F_pF_m}(\omega) = S_{F_mF_p}(\omega)$. We can evaluate $S_{\delta N_pF_m}(\omega)$ and
$S_{F_m\delta N_p}(\omega)$ by multiplying the left or right side of Eq. (7.197) with
$\mathcal{F}\{F_m(t)\}(\omega')$ and carrying out the integration as specified in Eq. (7.199). Note that
there is no correlation between $F_c(t)$ and $F_m(t)$ because the two are completely independent
processes. The resulting expressions are


$$S_{\delta N_pF_m}(\omega) = [D(\omega)]^{-1}(-\eta_{cc} + i\omega)\,S_{F_pF_m}(\omega), \tag{7.204}$$

$$S_{F_m\delta N_p}(\omega) = -[D^*(\omega)]^{-1}(\eta_{cc} + i\omega)\,S_{F_mF_p}(\omega). \tag{7.205}$$
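
As a short consistency check on the two cross terms, note that they are complex conjugates of each other, so their sum is real, as any contribution to a power spectral density must be. The step below is a sketch that assumes all coefficients $\eta_{jk}$ are real and uses the real, frequency-independent $S_{F_pF_m} = S_{F_mF_p}$ of Eq. (7.203):

```latex
S_{\delta N_p F_m}(\omega) + S_{F_m \delta N_p}(\omega)
  = \left[ \frac{-\eta_{cc} + i\omega}{D(\omega)}
         + \frac{-\eta_{cc} - i\omega}{D^{*}(\omega)} \right] S_{F_p F_m}(\omega)
  = 2\,\mathrm{Re}\!\left[ \frac{-\eta_{cc} + i\omega}{D(\omega)} \right] S_{F_p F_m}(\omega).
```

Because $S_{F_pF_m}$ is negative, the sign of this interference contribution at a given frequency is set by the real part of $(-\eta_{cc} + i\omega)/D(\omega)$.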


At this point we have everything needed to calculate the RIN of a laser operating above 
its threshold. The underlying details of the derivation are useful for learning the intricate 
reasoning required to calculate various correlation functions associated with the operation 
of any quantum device (not just lasers). The RIN is a performance measure of a laser that can
be determined experimentally. Its utility is that, even though one may not know the exact origin
of all stochastic terms, the derived quantities can be measured and used to validate the
underlying model.


7.4.4 Spectral Bandwidth of a Laser 


Even a laser oscillating in a single mode of its cavity has a finite bandwidth, often called 
the laser’s line width. For an intuitive understanding of this line width, we must consider 
the role of spontaneous emission. When a spontaneously emitted photon couples into a 
specific mode of the cavity, the laser’s phase changes suddenly in a random fashion. Thus, 
each such spontaneous-emission event acts as a noise source, which we can model using a 
Langevin force. We stress that events other than spontaneous emission also influence the 
line width of a real laser. The value we calculate using spontaneous emission represents 
the smallest value of the laser’s line width having its origin in the quantum noise. 

We again use the rate-equation model of Section 7.4.2 but add a rate equation for the phase
of the laser’s electric field. We construct this additional equation by considering how the
refractive index of the laser’s material changes with the carrier density $N_c$. This refractive
index is a complex quantity written as $n = n_r + in_i$. Its real part $n_r$ changes the phase of
the electric field, while its imaginary part $n_i$ is related to the gain $g$ of the medium through
the relation $g = 2n_i\omega_{eg}/c$. The real and imaginary parts are not independent; they are
related through the Kramers–Kronig relations (see Aside 3.2). In the literature on semiconductor
lasers, it is common to introduce a parameter called the line-width enhancement factor [409] to
account for this relationship. This parameter is defined as


$$\alpha_H = \frac{dn_r/dN_c}{dn_i/dN_c}. \tag{7.206}$$


Using this parameter, small variations in the optical phase satisfy [410]


$$\frac{d\,\delta\phi}{dt} = \frac{\alpha_H}{2}\,\Gamma v_g\frac{dG}{dN_c}\,\delta N_c + F_\phi(t), \tag{7.207}$$


where the Langevin noise term $F_\phi(t)$ accounts for phase fluctuations induced by spontaneous
emission. Note that fluctuations in the laser’s frequency are related to the phase
through the derivative $d(\delta\phi)/dt$.

Next, we need to establish various correlation functions of $F_\phi(t)$ with itself and other
Langevin forces. As this question has been addressed in Ref. [409], we quote the results
here:


BopR a m t’), (7.208) 


(FaFa) = 
$$\langle F_\phi(t)F_p(t')\rangle = 0, \quad \langle F_c(t)F_\phi(t')\rangle = 0, \quad \langle F_\phi(t)\,\delta N_c(t')\rangle = 0. \tag{7.209}$$


We use these expressions to find the spectral density of the frequency fluctuations
$\delta\omega = d(\delta\phi)/dt$ using the relation in Eq. (7.199) but with the substitution
$\delta P(t) \to \delta\omega$:


$$S_{\delta\omega\delta\omega}(\omega) = \left(\frac{\alpha_H}{2}\,\Gamma v_g\frac{dG}{dN_c}\right)^2 S_{\delta N_c\delta N_c}(\omega) + S_{F_\phi F_\phi}(\omega), \tag{7.210}$$

where we used the fact that phase fluctuations are not correlated with the carrier-density
fluctuations. This expression shows that the spectral density of frequency fluctuations is
enhanced because of carrier fluctuations when $\alpha_H \neq 0$.

The line width of a laser, $\delta\omega$, is defined as the full width (at half maximum) of its
spectral distribution and is related to the preceding spectral density as [411]


$$\delta\omega = S_{\delta\omega\delta\omega}(0) = (1 + \alpha_H^2)\,S_{F_\phi F_\phi}(0), \tag{7.211}$$


where we combined the two terms. This relation shows why $\alpha_H$ is called the line-width
enhancement factor. In the case of semiconductor lasers, values of $\alpha_H$ can vary from 2 to
8, depending on the laser’s design. Thus, carrier fluctuations considerably broaden the line
width of such lasers. In the absence of carrier fluctuations (or $\alpha_H = 0$), a laser’s line
width acquires its smallest value, dictated by phase fluctuations induced by the coupling of
spontaneously emitted photons into the laser mode at random times. This is the fundamental
limit set by the quantum noise.
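
To get a feel for the size of this broadening, the short Python sketch below evaluates the factor $1 + \alpha^2$ from Eq. (7.211) over the range of values quoted above for semiconductor lasers. The function name and the sampled values are choices made for this illustration.

```python
def linewidth_broadening(alpha):
    """Factor by which the line width exceeds its quantum-limited value, Eq. (7.211)."""
    return 1.0 + alpha**2


# Values of the line-width enhancement factor spanning the range quoted in the text
for alpha in (0.0, 2.0, 4.0, 8.0):
    print(f"alpha = {alpha:3.1f} -> line width is {linewidth_broadening(alpha):5.1f} x the minimum")
```

Over the quoted range of 2 to 8, the line width therefore exceeds its quantum-limited value by a factor between 5 and 65.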


7.5 Squeezed States of Light 


One may ask whether it is possible to reduce the noise of an optical device below the fun- 
damental limit set by the quantum noise. Although the answer is clearly no if one wants 
to reduce the noise at all frequencies, it turns out that the noise level of a device can be 
reduced below the limit set by the quantum noise in a narrow frequency range. This reduc- 
tion is realized through the use of special quantum states known as squeezed states. In 
recent years, the use of squeezed states for noise reduction has been proposed for several 
applications. 

An important application is related to the laser interferometer gravitational-wave obser- 
vatory (LIGO). The LIGO employs a Michelson interferometer with two 4-km-long arms 
to detect gravitational waves creating relative displacements as small as 1072? meter. This 
instrument made the first direct detection of gravitational waves in 2015 (announced in 2016),
which led to the 2017 Nobel Prize in Physics. However, its sensitivity is limited by the quantum
noise and can be improved by using an instrument known as the quantum vacuum squeezer, which
reduces the effects of vacuum fluctuations by squeezing them out. By the end of 2019, all of the
world’s LIGO detectors had been upgraded to use a quantum vacuum squeezer [412]. The use of such a
device has improved the LIGO’s range by 15% and has allowed the observation of gravita- 
tional waves that would have been unobservable without the use of squeezed states. Other 
similar applications are emerging. Thus, it is worth looking at the fundamentals behind the 
squeezed-state concept. 

As we have discussed in Section 2.2, quantization imposes fundamental limitations on
the accuracy with which we can simultaneously measure certain properties of a quantum
device. Heisenberg’s uncertainty principle describes the constraint imposed on the variances
of two noncommuting Hermitian operators. For two such operators $A$ and $B$, with
the associated variances $\Delta A^2 = \langle (A - \langle A\rangle)^2\rangle$ and
$\Delta B^2 = \langle (B - \langle B\rangle)^2\rangle$, the most general
form of the uncertainty principle is given by [413]

$$\Delta A^2\,\Delta B^2 \geq \frac{1}{4}\left[\langle C\rangle^2 + \langle D\rangle^2\right], \tag{7.212}$$


where $[A, B] = iC$ and $\langle D\rangle = \langle AB + BA\rangle - 2\langle A\rangle\langle B\rangle$.
Clearly, $\langle D\rangle$ is a measure of the correlation between $A$ and $B$.
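
The general relation in Eq. (7.212) is easy to test numerically. The Python sketch below draws random Hermitian matrices and a random normalized state in a small Hilbert space and evaluates both sides of the inequality. The dimension, the random seed, and the helper names are arbitrary choices made for this illustration.

```python
import numpy as np


def expectation(op, psi):
    """Expectation value <psi|op|psi> of a Hermitian operator (real part kept)."""
    return np.vdot(psi, op @ psi).real


def uncertainty_sides(op_a, op_b, psi):
    """Return the left and right sides of Eq. (7.212) for the state psi."""
    op_c = -1j * (op_a @ op_b - op_b @ op_a)   # defined through [A, B] = iC
    mean_a = expectation(op_a, psi)
    mean_b = expectation(op_b, psi)
    var_a = expectation(op_a @ op_a, psi) - mean_a**2
    var_b = expectation(op_b @ op_b, psi) - mean_b**2
    corr_d = expectation(op_a @ op_b + op_b @ op_a, psi) - 2.0 * mean_a * mean_b
    lhs = var_a * var_b
    rhs = 0.25 * (expectation(op_c, psi) ** 2 + corr_d**2)
    return lhs, rhs


def random_hermitian(dim, rng):
    """Random Hermitian matrix built from a complex Gaussian matrix."""
    m = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return (m + m.conj().T) / 2.0


def random_state(dim, rng):
    """Random normalized state vector."""
    v = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    return v / np.linalg.norm(v)


rng = np.random.default_rng(seed=2)
A = random_hermitian(4, rng)
B = random_hermitian(4, rng)
psi = random_state(4, rng)
lhs, rhs = uncertainty_sides(A, B, psi)
print(f"variance product = {lhs:.4f}, lower bound = {rhs:.4f}")
```

Setting the correlation term to zero in `rhs` gives the weaker bound of the standard uncertainty relation.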

When no correlation exists between the operators $A$ and $B$ such that $\langle D\rangle = 0$, we
obtain the standard uncertainty relation (see Aside 2.11)


$$\Delta A^2\,\Delta B^2 \geq \frac{1}{4}\langle C\rangle^2. \tag{7.213}$$


For example, the position and momentum operators, for which $[x, p] = i\hbar$ (or
$\langle C\rangle = \hbar$), obey Heisenberg’s uncertainty relation


$$(\Delta x)(\Delta p) \geq \hbar/2. \tag{7.214}$$


All preceding uncertainty relations show that one can reduce (or squeeze) the variance
of one observable, provided the variance of the other observable increases at the
same time. For example, we can squeeze $\Delta x$ to a relatively small value, provided the
standard deviation $\Delta p$ increases such that their product remains at least $\hbar/2$. This
is because quantum mechanics imposes a constraint only on the product of these two
quantities.

The best example of a squeezed state comes from quantum optics through the so-called
squeezed light [118]. The underlying concept is best understood by considering the phasor
representation of the electric field of light in one mode of the optical field. We write its
complex amplitude as $A = A_r + iA_i = |A|e^{i\phi}$, where the real and imaginary parts of $A$
correspond to two different quadratures, often called the in-phase (I) and quadrature (Q)
components. The uncertainty principle imposes the constraint $(\Delta A_r)(\Delta A_i) \geq \hbar/2$
when $A$ is expressed in suitable units. This means that, for a given measurement, the more
precisely we know the in-phase part, the more potential error there is in the quadrature part
(and vice versa).

The quantum state of radiation emitted by a laser is represented by a coherent state, for
which the variances of two quadratures are equal in magnitude. As shown in Figure 7.3,
this gives rise to a circularly symmetric region of uncertainty for the two quadratures. By
using specialized techniques, the circle can be squeezed into an ellipse, as shown schematically
on the right side in Figure 7.3. This deformation implies that uncertainty in the
$A_r$ quadrature is reduced, while the uncertainty in the less relevant quadrature $A_i$ has
increased. Other kinds of squeezing are possible, depending on the physical parameter that
exhibits noise below the shot-noise level. For example, photon-number squeezing occurs
if the variance of the photon-number operator becomes less than the shot-noise level. We
focus on quadrature squeezing in this section.

[Figure 7.3: The basics of the squeezed state. Uncertainty regions in the ($A_r$, $A_i$) plane for a
coherent state (left) and a squeezed state (right).]

When an optical field is quantized, its complex amplitude $A$ is related to the expectation
value of the annihilation operator $\hat{a}$. The operators for the two quadrature components
$\hat{X}$ and $\hat{Y}$ can then be introduced as


$$\hat{X} = \hat{a}^\dagger e^{i\theta} + \hat{a}e^{-i\theta}, \qquad \hat{Y} = i\left(\hat{a}^\dagger e^{i\theta} - \hat{a}e^{-i\theta}\right), \tag{7.215}$$


where $\theta$ is an arbitrary angle. Using the commutation relation $[\hat{a}, \hat{a}^\dagger] = 1$,
it is easy to show that $[\hat{X}, \hat{Y}] = 2i$ for any value of $\theta$. For any quantum state
$|\psi\rangle$, Heisenberg’s uncertainty principle imposes the constraint
$\sigma_X^2\sigma_Y^2 \geq 1$, where the noise variance is calculated using
$\sigma_X^2 = \langle\psi|(\hat{X} - \bar{X})^2|\psi\rangle$, where
$\bar{X} = \langle\psi|\hat{X}|\psi\rangle$ is the mean value. In the case of a coherent
state, $\sigma_X = \sigma_Y$.
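
The quadrature variances can be checked numerically in a truncated Fock basis. The Python sketch below builds the operators of Eq. (7.215) as matrices and evaluates the variances for a coherent state and for a squeezed vacuum state. The truncation dimension, the coherent amplitude, the squeezing parameter $r$, and the phase convention of the squeezed vacuum (real squeezing parameter, reduced noise along $X$ at $\theta = 0$) are all assumptions of this sketch.

```python
import numpy as np


def quadrature_operators(dim, theta=0.0):
    """Matrices of X and Y from Eq. (7.215) in a Fock basis truncated to dim states."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)  # annihilation operator
    a_dag = a.conj().T
    x_op = a_dag * np.exp(1j * theta) + a * np.exp(-1j * theta)
    y_op = 1j * (a_dag * np.exp(1j * theta) - a * np.exp(-1j * theta))
    return x_op, y_op


def variance(op, psi):
    """Variance <op^2> - <op>^2 in the state psi."""
    mean = np.vdot(psi, op @ psi).real
    mean_sq = np.vdot(psi, op @ (op @ psi)).real
    return mean_sq - mean**2


def coherent_state(alpha, dim):
    """Fock-space amplitudes of a coherent state, c_n proportional to alpha^n / sqrt(n!)."""
    c = np.zeros(dim, dtype=complex)
    c[0] = 1.0
    for n in range(1, dim):
        c[n] = c[n - 1] * alpha / np.sqrt(n)
    return c / np.linalg.norm(c)


def squeezed_vacuum(r, dim):
    """Fock-space amplitudes of a squeezed vacuum with real squeezing parameter r."""
    c = np.zeros(dim, dtype=complex)
    c[0] = 1.0
    for k in range(2, dim, 2):   # only even photon numbers are populated
        c[k] = -np.tanh(r) * np.sqrt((k - 1) / k) * c[k - 2]
    return c / np.linalg.norm(c)


DIM = 80   # truncation; must be large compared with the mean photon number
X_OP, Y_OP = quadrature_operators(DIM)
coh = coherent_state(1.5 + 0.5j, DIM)
sqz = squeezed_vacuum(0.5, DIM)

var_coh = (variance(X_OP, coh), variance(Y_OP, coh))
var_sqz = (variance(X_OP, sqz), variance(Y_OP, sqz))
print("coherent state :", var_coh)
print("squeezed vacuum:", var_sqz)
```

For the coherent state both variances equal 1, while for the squeezed vacuum they become $e^{-2r}$ and $e^{2r}$, so their product still equals 1, the minimum allowed by the uncertainty relation.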

Figure 7.4(a) shows a coherent state schematically with the same amount of noise in its 
two quadratures, resulting in a circular shape of fluctuations around the mean value. It turns 
out that the nonlinear effects inside a medium can turn a coherent state into a squeezed state 
similar to that shown in part (b). Such states have the property that quantum fluctuations 
in one quadrature are reduced below those of a coherent state. In the case of Figure 7.4(b), 
fluctuations decrease in the Y quadrature but are enhanced in the X quadrature, resulting 
in an elongated noise ellipse. This is also evident from the temporal trace where amplitude 
noise is enhanced considerably. A phase-sensitive detection scheme such as heterodyne 
detection must be employed to observe noise reduction in one quadrature. 


[Figure 7.4: (a) Coherent and (b) squeezed quantum states. Noise distribution along the two
quadratures is shown together with the time dependence of the optical field.]

A nonlinear process is required to transform a coherent state into a squeezed state. The
use of four-wave mixing (FWM) for squeezing was suggested during the 1980s, and a 
detailed theory was developed in a 1985 paper [414]. In one implementation, a pump beam 
initiates the FWM process inside an optical fiber (acting as the nonlinear medium), but no 
signal is launched at the input end [415]. Rather, vacuum fluctuations provide the initial 
seed for the growth of the signal and idler fields. Such a process is called spontaneous 
FWM to differentiate it from its stimulated version. Squeezing occurs because the noise 
components at the signal and idler frequencies are coupled through the fiber nonlinearity. 
Mathematically, it is governed by the following two equations: 


$$\frac{d\hat{a}_s}{dz} = i\delta\,\hat{a}_s + i\gamma A_p^2\,\hat{a}_i^\dagger, \tag{7.216}$$

$$\frac{d\hat{a}_i}{dz} = i\delta\,\hat{a}_i + i\gamma A_p^2\,\hat{a}_s^\dagger, \tag{7.217}$$


where $\delta$ provides a measure of the phase mismatch, $\gamma$ is the nonlinear parameter, and
$A_p$ is the amplitude of the pump (treated classically). Vacuum fluctuations are included through
the commutation relation $[\hat{a}_j(z), \hat{a}_k^\dagger(z)] = \delta_{jk}$ (where $j, k = s$ or $i$),
which must be satisfied for all values of $z$.

Equations (7.216) and (7.217) can be solved easily because of their linear nature. Their
general solution is given by [416]


$$\hat{a}_s(z) = \hat{a}_s(0)\left[\cosh(gz) + (i\delta/g)\sinh(gz)\right] + i(\gamma/g)A_p^2\,\hat{a}_i^\dagger(0)\sinh(gz), \tag{7.218}$$

$$\hat{a}_i(z) = \hat{a}_i(0)\left[\cosh(gz) + (i\delta/g)\sinh(gz)\right] + i(\gamma/g)A_p^2\,\hat{a}_s^\dagger(0)\sinh(gz), \tag{7.219}$$


where the parametric gain $g$ is defined as $g = (\gamma^2|A_p|^4 - \delta^2)^{1/2}$. This solution
reduces to that given in Ref. [414] in the case of perfect phase matching ($\delta = 0$). Note that
the signal amplitude at a distance $z$ inside the fiber evolves as a linear combination of
$\hat{a}_s(0)$ and $\hat{a}_i^\dagger(0)$. It is this feature that is responsible for squeezing. The
total field at the output end of a fiber of length $L$ is given by


$$\hat{A}(t) = A_p(L) + \hat{a}_s(L)\exp(-i\Omega t) + \hat{a}_i(L)\exp(i\Omega t), \tag{7.220}$$


where $\Omega = \omega_s - \omega_p$ is the signal’s detuning from the pump frequency $\omega_p$.
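
The structure of the solution in Eq. (7.218) can be explored numerically. Writing it as $\hat{a}_s(z) = \mu\,\hat{a}_s(0) + \nu\,\hat{a}_i^\dagger(0)$, the Python sketch below evaluates $\mu$ and $\nu$, verifies that $|\mu|^2 - |\nu|^2 = 1$ (the condition that preserves the commutation relation), and estimates the best achievable noise reduction as $(|\mu| - |\nu|)^2$ relative to the vacuum level. That last expression assumes ideal, lossless phase-sensitive detection at the optimum local-oscillator phase; it is a standard idealization and not a result quoted from the text. The pump is taken to be real, so that $A_p^2$ equals the pump power, and all parameter values are assumed.

```python
import numpy as np


def fwm_coefficients(gamma, pump_power, delta, z):
    """Return (mu, nu) such that a_s(z) = mu*a_s(0) + nu*a_i^dagger(0), from Eq. (7.218)."""
    g = np.sqrt(complex((gamma * pump_power) ** 2 - delta**2))  # imaginary if |delta| > gamma*P
    if abs(g * z) < 1e-9:
        cosh_gz, sinh_over_g = 1.0, z          # limit g -> 0
    else:
        cosh_gz, sinh_over_g = np.cosh(g * z), np.sinh(g * z) / g
    mu = cosh_gz + 1j * delta * sinh_over_g
    nu = 1j * gamma * pump_power * sinh_over_g
    return mu, nu


def minimum_noise_ratio(mu, nu):
    """Smallest quadrature noise relative to vacuum, assuming ideal detection."""
    return (abs(mu) - abs(nu)) ** 2


# Assumed, illustrative fiber parameters
GAMMA = 2e-3      # nonlinear parameter (1/W/m)
PUMP = 5.0        # pump power (W)
LENGTH = 100.0    # fiber length (m)

mu, nu = fwm_coefficients(GAMMA, PUMP, delta=0.0, z=LENGTH)
ratio = minimum_noise_ratio(mu, nu)
print(f"|mu|^2 - |nu|^2 = {abs(mu)**2 - abs(nu)**2:.6f}")
print(f"minimum noise = {10 * np.log10(ratio):.2f} dB relative to vacuum")
```

For perfect phase matching the ratio reduces to $\exp(-2\gamma|A_p|^2 z)$. In practice, loss and excess noise such as the Brillouin scattering discussed in the text keep the measured squeezing well below this ideal figure.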

From a physical standpoint, squeezing can be understood as deamplification of signal 
and idler waves for certain values of the relative phase between them. A phase-sensitive 
detection scheme is employed, and the phase of the local oscillator is adjusted to change 
the relative phase. In practice, the pump itself is used as a local oscillator with an adjustable
phase $\theta$. Its beating with the signal and idler fields at a photodetector generates an electric
current whose noise power varies with both $\Omega$ and $\theta$. In a 1986 experiment, a 647-nm CW
pump beam was launched into a 114-m-long optical fiber [417]. It was necessary to cool 
the fiber to liquid-helium temperature to overcome the noise produced by spontaneous 
Brillouin scattering. Cooling also reduced the threshold of stimulated Brillouin scatter- 
ing (SBS), which was suppressed by modulating the pump beam at a frequency much 
larger than the bandwidth of the Brillouin gain spectrum. Thermal Brillouin scattering 
from guided acoustic waves was still the most limiting factor in the experiment; it limited 
both the frequency range and the amount of noise squeezing. Squeezing was observed in
two spectral bands located around 45 and 55 MHz, but its magnitude was below 1 dB.
More recently, values as large as 10 dB have been realized.

In a 2013 experiment, squeezing was observed by measuring fluctuations in two quadra- 
tures of the electromagnetic field generated by a tunnel junction acting as a quantum 
conductor [418]. It was necessary to cool the tunnel junction to 10 mK. Even then, the 
observed squeezing was limited to below 1 dB. It was predicted in a 2019 study that noise
can be squeezed by as much as 12 dB through parametric amplification inside a
superconducting tunnel junction [419]. Considerable research has been done in recent years
to investigate the statistics of photons in the nonclassical radiation emitted by a tunnel
junction. Squeezing is expected to continue to play an important role in such studies.